Derived-Term Automata of Multitape Expressions with Composition*

Akim DEMAILLE†


Abstract 


Rational expressions are powerful tools to define automata, but are
often restricted to single-tape automata. Our goal is to unleash their
expressive power for transducers and, more generally, any multitape
automaton; for instance $(a^+|x + b^+|y)^*$. We generalize the construction
of the derived-term automaton by using expansions. This approach
generates small automata, and even allows us to support a composition
operator.

Keywords: Antimirov, automaton, derivatives, derived term, expansion,
extended rational expressions, transducer, multitape, composition


1 Introduction 


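To compute the edit-distance between words and/or (rational) languages,
Mohri [20, Figure 4] introduces the following two-tape automaton (aka
transducer) $\mathcal{A}$, whose weights, written in angle brackets, are in $\langle \mathbb{N}, \min, + \rangle$:

[Figure: automaton $\mathcal{A}$; its transitions are labeled $\langle 0\rangle a|a$, $\langle 0\rangle b|b$, $\langle 1\rangle \varepsilon|a$, $\langle 1\rangle \varepsilon|b$, $\langle 1\rangle a|\varepsilon$, $\langle 1\rangle b|\varepsilon$.]

Ng [21, Figure 2] focuses on the prefix distance and introduces $\mathcal{A}'$: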


*Extended version of Derived-term automata of multitape rational expressions, presented
at CIAA'16 [9].

†EPITA Research and Development Laboratory (LRDE), 14-16, rue Voltaire, 94276 Le
Kremlin-Bicêtre, France, E-mail: akim@lrde.epita.fr. (2017-11-27 19:48:50 +0100 6e2f2c2)




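[Figure: automaton $\mathcal{A}'$; its transitions are labeled $\langle 0\rangle a|a$, $\langle 0\rangle b|b$, $\langle 1\rangle \varepsilon|a$, $\langle 1\rangle \varepsilon|b$, $\langle 1\rangle a|\varepsilon$, $\langle 1\rangle b|\varepsilon$, $\langle 2\rangle a|b$, $\langle 2\rangle b|a$.]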


Such automata are tedious to type in. Rational expressions have
already proved to be extremely concise and handy tools to define automata,
but rational expressions for multitape automata have received little
attention. Yet an expression such as $(a|a + b|b + \langle 1\rangle(1|a + 1|b + a|1 + b|1))^*$
clearly denotes the behavior of $\mathcal{A}$.¹ Provided that operators such as $+$
can be used below the tupling operator $|$, a more concise expression
is $E := (a|a + b|b + \langle 1\rangle(1|(a + b) + (a + b)|1))^*$.² Similarly
$E' := (a|a + b|b)^*(\langle 1\rangle(1|(a + b) + (a + b)|1) + \langle 2\rangle(a|b + b|a))^*$ denotes the behavior of $\mathcal{A}'$.³

Our purpose is to define multitape rational expressions such as E and E’ 
(Section 2) and to introduce an algorithm that computes precisely automata 
A and A’ from them (Section 4). To this end, we rely on an intermediary 
structure, expansions, studied in Section 3. 

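Mohri [20] also shows (still in Figure 4) that $\mathcal{A}$ is equivalent to the
composition of the following two simpler transducers, using two new symbols
to denote Insertion and Suppression⁴:

[Figure: two transducers. The transitions of the first are labeled $a|a$, $b|b$, $\langle 1\rangle \varepsilon|I$, $\langle 1\rangle a|S$, $\langle 1\rangle b|S$; those of the second are labeled $I|a$, $I|b$, $S|\varepsilon$, $a|a$, $b|b$.]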


In Section 5 we introduce a composition operator $@$, such that $E$ is
equivalent to $(a|a + b|b + \langle 1\rangle(1|I + [ab]|S))^* \mathbin{@} (a|a + b|b + I|[ab] + S|1)^*$, and
extend our procedure to support it. It builds $\mathcal{A}$ from this expression.

Various aspects of our proposal are discussed in Section 6, and
related work is presented in Section 7.


The contributions of this paper are: 


¹We use $\varepsilon$ for the empty word, 1 for the expression that denotes the empty word, and
we now leave the unit weight implicit; 0 in the current case.

²Linguists often simplify (partial) identities such as $(a|a) + (b|b)(c|c)^*$ as $a + bc^*$. With
such conventions, we could even write $([a-z] + \langle 1\rangle(1|[a-z] + [a-z]|1))^*$.

³As a matter of fact, the $\langle 2\rangle(a|b + b|a)$ part is useless, as substitution is already scored
as 1 + 1: one suppression, one insertion.

⁴In the paper, the second automaton shows transitions for $\langle 0\rangle S|a$ and $\langle 0\rangle S|b$ which
were clearly not meant by the author.
• we define (weighted) multitape rational expressions featuring operators
for tupling, $|$, and for composition, $@$;

• we provide an algorithm to build a concise automaton equivalent to
such an expression. This algorithm is a generalization of the
derived-term based algorithms, freed from the requirement that the monoid is
free.


The constructs exposed in this paper are implemented in Vcsn⁵. Vcsn is
a free-software platform dedicated to weighted automata and rational
expressions [10]; its lowest layer is a C++ library, on top of which Python/IPython
bindings provide an interactive graphical environment. The examples in
this paper are demonstrated at
http://vcsn.lrde.epita.fr/dload/2.6/notebooks/SACS-2017.html. They can be executed, modified, and extended
directly from any web browser at http://vcsn-sandbox.lrde.epita.fr.


2 Notations 


Our purpose is to define (weighted) multitape rational expressions, such as
$E_1 := \langle 5\rangle 1|1 + \langle 4\rangle ade^*|x + \langle 3\rangle bde^*|x + \langle 2\rangle ace^*|xy + \langle 6\rangle bce^*|xy$ (weights are
written in angle brackets). It relates $ade$ with $x$, with weight 4. We introduce
an algorithm to build a multitape automaton (aka transducer) from such
an expression, e.g., Fig. 1. This algorithm relies on rational expansions.
They are to the derivatives of rational expressions what differential forms
are to the derivatives of functions. Defining expansions requires several
concepts, defined bottom-up in this section. The following figure presents
these different entities, how they relate to each other, and where we are
heading: given a weighted multitape rational expression such as $E_1$,
compute its expansion:
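\[
\langle 5\rangle \oplus a|x \odot [\langle 2\rangle \odot ce^*|y \oplus \langle 4\rangle \odot de^*|1] \oplus b|x \odot [\langle 6\rangle \odot ce^*|y \oplus \langle 3\rangle \odot de^*|1]
\]

[Figure: the expansion above, annotated. $\langle 5\rangle$ is the (immediate) constant term and the rest is the (immediate) proper part of the expansion (Section 3.1); $a|x$ and $b|x$ are the firsts; each bracketed term is a polynomial (Section 2.3), a sum of monomials such as $\langle 2\rangle \odot ce^*|y$, made of a weight and an expression (Section 2.2), the derived term.]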


from which we build its derived-term automaton (Fig. 1). 


⁵Vcsn http://vcsn.lrde.epita.fr.
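[Figure 1 (image not recovered); the legible transition labels include $\langle 2\rangle a|x$, $\langle 6\rangle b|x$, $\langle 4\rangle a|x$, $\langle 3\rangle b|x$.]

Figure 1: The derived-term automaton of $E_1$ (see Examples 1 to 4) with
$E_1 := \langle 5\rangle 1|1 + \langle 4\rangle ade^*|x + \langle 3\rangle bde^*|x + \langle 2\rangle ace^*|xy + \langle 6\rangle bce^*|xy$.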


It is helpful to think of expansions as a normal form for expressions. 


2.1 Rational Series 


Series will be used to define the semantics of the forthcoming structures: 
they are to weighted automata what languages are to Boolean automata. 
Not all languages are rational (denoted by an expression), and similarly, 
not all series are rational (denoted by a weighted expression). We follow 
Sakarovitch [24, Chap. III]. 

In order to cope with (possibly) several tapes, we cannot rely on the 
traditional definitions based on the free monoid A* for some alphabet A. 


2.1.1 Labels 


Let $M$ be a monoid (e.g., $A^*$ or $A^* \times B^*$), whose neutral element is denoted
$\varepsilon_M$ (or $\varepsilon$ when clear from the context) and named the empty word.
For consistency with the way transducers are usually represented, we use
$m|n$ rather than $(m,n)$ to denote the pair of $m$ and $n$. For instance
$\varepsilon_{A^* \times B^*} = \varepsilon_{A^*} | \varepsilon_{B^*}$, and $\varepsilon_M | a \in M \times \{a\}^*$.

A set of generators $G$ of $M$ is a subset of $M$ such that $G^* = M$. By
$G'$ we denote $\{\varepsilon\} \cup G$. A monoid $M$ is of finite type (or finitely generated)
if it admits a finite set of generators.

A monoid $M$ is graded if it admits a gradation function $|\cdot| : M \to \mathbb{N}$
such that $\forall m,n \in M$, $|m| = 0$ iff $m = \varepsilon$, and $|mn| = |m| + |n|$. Cartesian
products of graded monoids are graded, and Cartesian products of finitely
generated monoids are finitely generated. Free monoids and Cartesian
products of free monoids are graded and finitely generated.
2.1.2 Weights


Let $\langle K, +, \cdot, 0_K, 1_K\rangle$ (or $K$ for short) be a semiring whose (possibly
non-commutative) multiplication will be denoted by juxtaposition. $K$ is commutative
if its multiplication is. $K$ is a topological semiring if it is equipped with a
topology, and both addition and multiplication are continuous. It is strong
if the product of two summable families is summable.


2.1.3 Series 


A (formal power) series over $M$ with weights (or multiplicities) in $K$ is a
map from $M$ to $K$. The weight of $m \in M$ in a series $s$ is denoted $s(m)$. The
null series, $m \mapsto 0_K$, is denoted 0; for any $m \in M$ (including $\varepsilon_M$), $m$ denotes
the series $u \mapsto 1_K$ if $u = m$, $0_K$ otherwise. If $M$ is of finite type, then we
can define the Cauchy product of series: $s \cdot t := m \mapsto \sum_{u,v \in M \mid uv = m} s(u) \cdot t(v)$.
Equipped with the pointwise addition ($s + t := m \mapsto s(m) + t(m)$)
and $\cdot$ as multiplication, the set of these series forms a semiring denoted
$\langle K\langle\!\langle M\rangle\!\rangle, +, \cdot, 0, \varepsilon\rangle$.

The constant term of a series $s$, denoted $s_\varepsilon$, is $s(\varepsilon)$, the weight of the
empty word. A series $s$ is proper if $s_\varepsilon = 0_K$. The proper part of $s$ is the
proper series $s_p$ such that $s = s_\varepsilon + s_p$.
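The definitions above can be made concrete with a small sketch. It assumes, for illustration only, that $M$ is a free monoid (words are Python strings), that series have finite support, and that $K$ is the ordinary arithmetic of Python numbers; the function names are hypothetical and not part of Vcsn.

```python
from collections import defaultdict


def cauchy_product(s, t):
    """Cauchy product of two finitely supported series.

    A series is a dict mapping words (str) to non-zero weights.
    Assumption: the monoid is free and the semiring is that of
    Python numbers with the usual + and *.
    """
    result = defaultdict(int)
    for u, ku in s.items():
        for v, kv in t.items():
            # Every factorization m = uv contributes s(u) * t(v).
            result[u + v] += ku * kv
    return {w: k for w, k in result.items() if k != 0}


def constant_term(s):
    """Weight of the empty word."""
    return s.get("", 0)


def proper_part(s):
    """The series s with its constant term removed."""
    return {w: k for w, k in s.items() if w != ""}


s = {"": 2, "a": 3}
t = {"": 5, "a": 1, "b": 1}
p = cauchy_product(s, t)
```

Here the word `a` of the product receives contributions from both factorizations $\varepsilon \cdot a$ and $a \cdot \varepsilon$.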


2.1.4 Star 


The star of a series is an infinite sum: $s^* := \sum_{n \in \mathbb{N}} s^n$. To ensure semantic
soundness, we need $M$ to be a graded monoid and $K$ to be a strong topological
semiring.

We will need a property of star based on the following result. In
various forms it is named the 'denesting rule' [15, p. 57], the 'property S'
[24, Propositions III.2.5 and III.2.6], or the 'sum-star equation' [11, p. 188].
Proofs can be found for the axiomatic approach of star (based on Conway
semirings), but we followed the topology-based one, for which we did not
find a published version.


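Proposition 1 (Super S) Let $K$ be a strong topological semiring. For
any series $s,t \in K\langle\!\langle A^*\rangle\!\rangle$, if $s_\varepsilon^*$, $(t_\varepsilon s_\varepsilon^*)^*$, and $(s_\varepsilon + t_\varepsilon)^*$ are defined and
$(s_\varepsilon + t_\varepsilon)^* = s_\varepsilon^*(t_\varepsilon s_\varepsilon^*)^*$, then $(s + t)^* = s^*(ts^*)^*$.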


Proof: This proof goes in several steps, with different constraints over $s$
and $t$. From a formal point of view, it is actually 'trivial': a simple look at
the proof of Sakarovitch [24, Proposition III.2.6] shows that both expressions
are formally equivalent. The real technical difficulty is semantic: ensuring
that all the (infinite) sums are properly defined.

We actually only need Item 4 below to establish Proposition 2. 


1. When $s$ and $t$ are proper. This is a well-known consequence of Arden's
lemma [24, Proposition III.2.5].


2. When $s \in K$ and $t$ is proper. This property holds when $K$ is a strong
topological semiring, and when $s^*$ is defined [24, Proposition III.2.6].


3. When $s,t \in K$. This result follows directly from the hypothesis of
this property. Note however that $s^*(ts^*)^* = (s + t)^*$ is verified in all
the 'usual' semirings.


• If $K$ is a usual numerical semiring (i.e., $\mathbb{Q}$, $\mathbb{R}$, or more generally, a
subring of $\mathbb{C}^n$), then $s^*$ is the inverse of $1 - s$, i.e., $(1 - s)s^* =
s^*(1 - s) = 1$. To establish the result, we show that $s^*(ts^*)^*$
is the inverse of $1 - (s + t)$. By hypothesis, $s^*$ and $(ts^*)^*$ are
defined. $(1 - (s + t))s^*(ts^*)^* = (1 - s)s^*(ts^*)^* - ts^*(ts^*)^* =
(ts^*)^* - ts^*(ts^*)^* = (1 - ts^*)(ts^*)^* = 1$, which shows that $(s + t)^*$
is defined.

• If $K$ is a tropical semiring, say, $\langle \mathbb{Z} \cup \{\infty\}, \min, +, \infty, 0\rangle$, then $s^*$
is defined iff $s \geq 0$, and then $s^* = 0$, hence the result trivially
follows.

• If $K$ is the Log semiring, $\langle \mathbb{R}^+ \cup \{\infty\}, +_{\log}, +, \infty, 0\rangle$ where $+_{\log} :=
x, y \mapsto -\log(\exp(-x) + \exp(-y))$, then we get $x^* = \log(1 -
\exp(-x))$. Again, one can verify the identity.


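4. When $s \in K$ and $t$ is any series. By hypothesis, $(ts^*)^*$ is defined,
i.e., $(t_\varepsilon s^*)^*$ is defined, so by Item 3, $(s + t_\varepsilon)^*$ is defined.

\[
\begin{aligned}
(s + t)^* &= (s + t_\varepsilon + t_p)^* \\
&= (s + t_\varepsilon)^*(t_p(s + t_\varepsilon)^*)^* && \text{by Item 2, $t_p$ proper, $(s + t_\varepsilon)^*$ defined} \\
&= s^*(t_\varepsilon s^*)^*(t_p s^*(t_\varepsilon s^*)^*)^* && \text{by Item 3} \\
&= s^*(t_\varepsilon s^* + t_p s^*)^* && \text{by Item 2, $t_p s^*$ proper, $(t_\varepsilon s^*)^*$ defined} \\
&= s^*((t_\varepsilon + t_p)s^*)^* \\
&= s^*(ts^*)^*
\end{aligned}
\]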
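5. When $s$ is any series and $t$ is proper. By hypothesis, $s^*$ is defined,
so $s_\varepsilon^*$ is defined.

\[
\begin{aligned}
(s + t)^* &= (s_\varepsilon + (s_p + t))^* \\
&= s_\varepsilon^*((s_p + t)s_\varepsilon^*)^* && \text{by Item 2, $s_p + t$ proper} \\
&= s_\varepsilon^*(s_p s_\varepsilon^* + t s_\varepsilon^*)^* \\
&= s_\varepsilon^*(s_p s_\varepsilon^*)^*(t s_\varepsilon^*(s_p s_\varepsilon^*)^*)^* && \text{by Item 1, $s_p s_\varepsilon^*$ and $t s_\varepsilon^*$ proper} \\
&= (s_\varepsilon + s_p)^*(t(s_\varepsilon + s_p)^*)^* && \text{by Item 2, $s_\varepsilon^*$ defined, $s_p$ proper} \\
&= s^*(ts^*)^*
\end{aligned}
\]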


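6. When $s$ and $t$ are any series. By hypothesis, $s^*$ is defined.

\[
\begin{aligned}
(s + t)^* &= (s + t_\varepsilon + t_p)^* \\
&= (s + t_\varepsilon)^*(t_p(s + t_\varepsilon)^*)^* && \text{by Item 5, $t_p$ proper} \\
&= s^*(t_\varepsilon s^*)^*(t_p s^*(t_\varepsilon s^*)^*)^* && \text{by Item 4, $t_\varepsilon \in K$} \\
&= s^*(t_\varepsilon s^* + t_p s^*)^* && \text{by Item 5, $t_p s^*$ proper} \\
&= s^*(ts^*)^*
\end{aligned}
\]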
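The scalar case of the proof (both $s$ and $t$ are weights) can be checked numerically. The sketch below is only an illustration on two of the semirings discussed there: it assumes $\mathbb{Q}$ with $s^* = 1/(1 - s)$ for $|s| < 1$, and the tropical semiring over integers with $s^* = 0$ for $s \geq 0$; the function names are hypothetical.

```python
from fractions import Fraction


def star_q(s):
    """Star in Q: the inverse of 1 - s, defined here for |s| < 1."""
    if abs(s) >= 1:
        raise ValueError("star undefined")
    return 1 / (1 - s)


def star_trop(s):
    """Star in (Z u {oo}, min, +): defined iff s >= 0, and then 0."""
    if s < 0:
        raise ValueError("star undefined")
    return 0


# Rational case: (s + t)* versus s* (t s*)*.
s, t = Fraction(1, 4), Fraction(1, 4)
lhs_q = star_q(s + t)
rhs_q = star_q(s) * star_q(t * star_q(s))

# Tropical case: the sum is min, the product is +.
a, b = 3, 5
lhs_trop = star_trop(min(a, b))
rhs_trop = star_trop(a) + star_trop(b + star_trop(a))
```

Both sides agree in each semiring, as the proof asserts for scalars.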


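All the usual semirings ($\mathbb{B}$, $\mathbb{F}_2$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{R}_{\min}$, Log, etc.) are strong
topological semirings, in which if $s_\varepsilon^*$, $(t_\varepsilon s_\varepsilon^*)^*$, and $(s_\varepsilon + t_\varepsilon)^*$ are defined then
$(s_\varepsilon + t_\varepsilon)^* = s_\varepsilon^*(t_\varepsilon s_\varepsilon^*)^*$.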


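Proposition 2 Let $K$ be a strong topological semiring. Let $s \in K$, $t \in
K\langle\!\langle A^*\rangle\!\rangle$; if $s^*$, $(t_\varepsilon s^*)^*$, and $(s + t_\varepsilon)^*$ are defined and $(s + t_\varepsilon)^* = s^*(t_\varepsilon s^*)^*$
then $(s + t)^* = s^* + s^*t(s + t)^*$.


Proof: The result follows from Proposition 1, and from $(ts^*)^* = \varepsilon + (ts^*)(ts^*)^*$:
\[
(s + t)^* = s^*(ts^*)^* = s^*(\varepsilon + (ts^*)(ts^*)^*) = s^* + s^*t(s^*(ts^*)^*) = s^* + s^*t(s + t)^*.
\]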


2.1.5 Tuple 


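We suppose $K$ is commutative. Let $M$ and $N$ be two monoids. The tupling
of two series $s \in K\langle\!\langle M\rangle\!\rangle$, $t \in K\langle\!\langle N\rangle\!\rangle$, is the series $s|t := m|n \in M \times N \mapsto
s(m)t(n)$. It is a member of $K\langle\!\langle M \times N\rangle\!\rangle$.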
Proposition 3 (Series Tupling is Bilinear)
For all series $s,s' \in K\langle\!\langle M\rangle\!\rangle$, $t,t' \in K\langle\!\langle N\rangle\!\rangle$, and all weights $k \in K$,

\[
\begin{aligned}
(s + s') \,|\, t &= s|t + s'|t & s \,|\, (t + t') &= s|t + s|t' \\
(ks) \,|\, t &= k(s|t) & s \,|\, (kt) &= k(s|t)
\end{aligned}
\]

Proof: We prove the first equality. Let $m|n \in M \times N$. $((s + s')|t)(m|n) =
(s + s')(m) \cdot t(n) = (s(m) + s'(m)) \cdot t(n) = s(m) \cdot t(n) + s'(m) \cdot t(n) =
(s|t)(m|n) + (s'|t)(m|n) = (s|t + s'|t)(m|n)$.
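The first equality of Proposition 3 can be observed on finitely supported series. This is a minimal sketch, assuming series are Python dicts from words to integer weights and that a pair $m|n$ is represented by the tuple `(m, n)`; the helper names are hypothetical.

```python
def add_series(s, t):
    """Pointwise sum of two finitely supported series (dicts)."""
    out = dict(s)
    for w, k in t.items():
        out[w] = out.get(w, 0) + k
    return {w: k for w, k in out.items() if k != 0}


def tuple_series(s, t):
    """Tupling s|t: maps the pair (m, n) to s(m) * t(n)."""
    return {(m, n): ks * kt
            for m, ks in s.items()
            for n, kt in t.items()
            if ks * kt != 0}


s1 = {"a": 2, "": 1}
s2 = {"a": 3, "b": 1}
t = {"x": 5, "y": 7}

# (s1 + s2) | t  versus  s1|t + s2|t
lhs = tuple_series(add_series(s1, s2), t)
rhs = add_series(tuple_series(s1, t), tuple_series(s2, t))
```

The two dictionaries coincide; for instance both map `("a", "x")` to $(2 + 3) \cdot 5 = 25$.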

From now on, $M$ is a graded monoid of finite type, and $K$ a commutative
strong topological semiring.


2.2. Weighted Rational Expressions 


Contrary to the usual definition, we do not require a finite alphabet: any 
set of generators G C M will do. For expressions with more than one tape, 
we require K to be commutative; however, for single tape expressions, our 
results apply to non-commutative semirings, hence there are two exterior 
products. 


Definition 1 (Expression) A rational expression $E$ over $G$ is a term built
from the following grammar, where $a \in G$ denotes any non-empty label, and
$k \in K$ any weight:

\[
E ::= 0 \mid 1 \mid a \mid E + E \mid \langle k\rangle E \mid E\langle k\rangle \mid E \cdot E \mid E^* \mid E \,|\, E
\]


Expressions are syntactic; they are finite notations for (some) series. 


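Definition 2 (Series Denoted by an Expression) Let $E$ be an
expression. The series denoted by $E$, noted $\llbracket E\rrbracket$, is defined by induction on $E$:

\[
\begin{aligned}
\llbracket 0\rrbracket &:= 0 & \llbracket 1\rrbracket &:= \varepsilon & \llbracket a\rrbracket &:= a \\
\llbracket E + F\rrbracket &:= \llbracket E\rrbracket + \llbracket F\rrbracket & \llbracket \langle k\rangle E\rrbracket &:= k\llbracket E\rrbracket & \llbracket E\langle k\rangle\rrbracket &:= \llbracket E\rrbracket k \\
\llbracket E \cdot F\rrbracket &:= \llbracket E\rrbracket \cdot \llbracket F\rrbracket & \llbracket E^*\rrbracket &:= \llbracket E\rrbracket^* & \llbracket E \,|\, F\rrbracket &:= \llbracket E\rrbracket \,|\, \llbracket F\rrbracket
\end{aligned}
\]

An expression is valid if it denotes a series. More specifically, there are two
requirements. First, the expression must be well-formed, i.e., concatenation
and disjunction must be applied to expressions of appropriate number of tapes.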




For instance, $a + b|c$ and $a(b|c|d)$ are ill-formed⁶, $(a|b)^*|c + a|(b|c)^*$ is
well-formed. Second, to ensure that $\llbracket F\rrbracket^*$ is well defined for each subexpression of
the form $F^*$, the constant term of $\llbracket F\rrbracket$ must be starrable in $K$ (Proposition 2).

Let $[n]$ denote $\{1,\ldots,n\}$. The size (aka length) of a (valid) expression
$E$, $|E|$, is its total number of symbols, not counting parentheses; for a given
tape number $i \in [k]$ the width on tape $i$, $\|E\|_i$, is the number of occurrences
of labels on the tape $i$; the width of $E$ (aka literal length), $\|E\| := \sum_{i \in [k]} \|E\|_i$,
is the total number of occurrences of labels different from $\varepsilon$.

Two expressions $E$ and $F$ are equivalent iff $\llbracket E\rrbracket = \llbracket F\rrbracket$. Some expressions
are 'trivially equivalent', for instance $E \cdot 1$ and $E$. To simplify the examples,
our expressions will always be simplified according to the following identities.


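Definition 3 (Trivial Identities) Any subexpression of a form listed to
the left of a '$\Rightarrow$' is rewritten as indicated on the right.

\[
\begin{gathered}
E + 0 \Rightarrow E \qquad 0 + E \Rightarrow E \\
\langle 0_K\rangle E \Rightarrow 0 \qquad \langle 1_K\rangle E \Rightarrow E \qquad \langle k\rangle 0 \Rightarrow 0 \qquad \langle k\rangle\langle h\rangle E \Rightarrow \langle kh\rangle E \\
E\langle 0_K\rangle \Rightarrow 0 \qquad E\langle 1_K\rangle \Rightarrow E \qquad 0\langle k\rangle \Rightarrow 0 \qquad E\langle k\rangle\langle h\rangle \Rightarrow E\langle kh\rangle \\
(\langle k\rangle E)\langle h\rangle \Rightarrow \langle k\rangle(E\langle h\rangle) \qquad \ell\langle k\rangle \Rightarrow \langle k\rangle \ell \\
E \cdot 0 \Rightarrow 0 \qquad 0 \cdot E \Rightarrow 0 \\
(\langle k\rangle^? 1) \cdot E \Rightarrow \langle k\rangle^? E \qquad E \cdot (\langle k\rangle^? 1) \Rightarrow E\langle k\rangle^? \\
0^* \Rightarrow 1 \\
E \,|\, 0 \Rightarrow 0 \qquad 0 \,|\, E \Rightarrow 0 \qquad (\langle k\rangle^? E) \,|\, (\langle h\rangle^? F) \Rightarrow \langle kh\rangle^? E \,|\, F
\end{gathered}
\]

where $E$ is a rational expression, $\ell \in G \cup \{1\}$ a label, $k,h \in K$ weights, and
$\langle k\rangle^? \ell$ denotes either $\langle k\rangle \ell$, or $\ell$ in which case $k = 1_K$ in the right-hand side
of $\Rightarrow$.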


These identities are taken from Sakarovitch [24]; they are discussed in
Section 6.2. Note that linearity ('weighted ACI': associativity, commutativity
and $\langle k\rangle E + \langle h\rangle E = \langle k + h\rangle E$) is not enforced; this is the role of polynomials.


2.3 Rational Polynomials 


At the core of the idea of ‘partial derivatives’ introduced by Antimirov [3], 
is that of sets of rational expressions, later generalized in weighted sets by 


⁶As discussed in the introduction, an implementation could accept them, as
abbreviations for $a|a + b|c$ and $(a|a|a)(b|c|d)$. But what sense could be given to $a|b + c|d|e$?




Lombardy and Sakarovitch [16], i.e., functions (partial, with finite domain)
from the set of rational expressions into $K \setminus \{0_K\}$. It proves useful to view
such structures as 'polynomials of expressions'. In essence, they capture the
linearity of addition.


Definition 4 (Rational Polynomial) A polynomial (of rational
expressions) is a finite (left) linear combination of expressions. Syntactically it is
a term built from the grammar

\[
P ::= 0 \mid \langle k_1\rangle \odot E_1 \oplus \cdots \oplus \langle k_n\rangle \odot E_n
\]

where $k_i \in K \setminus \{0_K\}$ denote non-null weights, and $E_i$ denote non-null
expressions. Expressions may not appear more than once in a polynomial. A
monomial is a pair $\langle k_i\rangle \odot E_i$. The weight of $E$ in $P$ is written $P(E)$.


We use specific symbols ($\odot$ and $\oplus$) to clearly separate the outer
polynomial layer from the inner expression layer. Let $P = \bigoplus_{i \in [n]} \langle k_i\rangle \odot E_i$
be a polynomial of expressions. The projection of $P$ is the expression
$\mathrm{expr}(P) := \langle k_1\rangle E_1 + \cdots + \langle k_n\rangle E_n$ (or 0 if $P$ is null); this operation is
performed on a canonical form of the polynomial (expressions are sorted in a
well-defined order). Polynomials denote series: $\llbracket P\rrbracket := \llbracket \mathrm{expr}(P)\rrbracket$. The terms of
$P$ is the set $\mathrm{exprs}(P) := \{E_1,\ldots,E_n\}$.


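Example 1 Let $E_1 := \langle 5\rangle 1|1 + \langle 4\rangle ade^*|x + \langle 3\rangle bde^*|x + \langle 2\rangle ace^*|xy +
\langle 6\rangle bce^*|xy$. Polynomial $P_{1,a|x} := \langle 2\rangle \odot ce^*|y \oplus \langle 4\rangle \odot de^*|1$ has two
monomials: $\langle 2\rangle \odot ce^*|y$ and $\langle 4\rangle \odot de^*|1$. It denotes the (left) quotient of
$\llbracket E_1\rrbracket$ by $a|x$, and $P_{1,b|x} := \langle 6\rangle \odot ce^*|y \oplus \langle 3\rangle \odot de^*|1$ the quotient by $b|x$.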


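Let $P = \bigoplus_{i \in [n]} \langle k_i\rangle \odot E_i$, $Q = \bigoplus_{j \in [m]} \langle h_j\rangle \odot F_j$ be polynomials, $k$ a weight
and $F$ an expression, all possibly null; we introduce the following operations:

\[
\begin{aligned}
P \cdot F &:= \bigoplus_{i \in [n]} \langle k_i\rangle \odot (E_i \cdot F) \\
\langle k\rangle P &:= \bigoplus_{i \in [n]} \langle k k_i\rangle \odot E_i &
P\langle k\rangle &:= \bigoplus_{i \in [n]} \langle k_i\rangle \odot (E_i\langle k\rangle) \\
P \,|\, 1 &:= \bigoplus_{i \in [n]} \langle k_i\rangle \odot E_i \,|\, 1 &
1 \,|\, P &:= \bigoplus_{i \in [n]} \langle k_i\rangle \odot 1 \,|\, E_i \\
P \,|\, Q &:= \bigoplus_{(i,j) \in [n] \times [m]} \langle k_i \cdot h_j\rangle \odot E_i \,|\, F_j
\end{aligned}
\]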




Trivial identities might simplify the result. Note the asymmetry between left 
and right exterior products. The addition of polynomials is commutative, 
multiplication by zero (be it an expression or a weight) evaluates to the null 
polynomial, and the left-multiplication by a weight is distributive. 


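Lemma 1 $\llbracket P \cdot F\rrbracket = \llbracket P\rrbracket \cdot \llbracket F\rrbracket$, $\llbracket \langle k\rangle P\rrbracket = \langle k\rangle\llbracket P\rrbracket$, $\llbracket P\langle k\rangle\rrbracket = \llbracket P\rrbracket\langle k\rangle$,
$\llbracket P \,|\, Q\rrbracket = \llbracket P\rrbracket \,|\, \llbracket Q\rrbracket$.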


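Proof: The proofs of the first three equalities are straightforward. The
last one is a direct consequence of the bilinearity of tupling.

\[
\begin{aligned}
\llbracket P \,|\, Q\rrbracket &= \Bigl\llbracket \bigoplus_{(i,j) \in [n] \times [m]} \langle k_i \cdot h_j\rangle \odot E_i \,|\, F_j \Bigr\rrbracket \\
&= \sum_{(i,j) \in [n] \times [m]} \langle k_i \cdot h_j\rangle \llbracket E_i \,|\, F_j\rrbracket \\
&= \sum_{(i,j) \in [n] \times [m]} \langle k_i \cdot h_j\rangle \bigl(\llbracket E_i\rrbracket \,|\, \llbracket F_j\rrbracket\bigr) && \text{by Definition 2} \\
&= \Bigl(\sum_{i \in [n]} \langle k_i\rangle \llbracket E_i\rrbracket\Bigr) \,\Big|\, \Bigl(\sum_{j \in [m]} \langle h_j\rangle \llbracket F_j\rrbracket\Bigr) && \text{by Proposition 3} \\
&= \Bigl\llbracket \bigoplus_{i \in [n]} \langle k_i\rangle \odot E_i \Bigr\rrbracket \,\Big|\, \Bigl\llbracket \bigoplus_{j \in [m]} \langle h_j\rangle \odot F_j \Bigr\rrbracket \\
&= \llbracket P\rrbracket \,|\, \llbracket Q\rrbracket
\end{aligned}
\]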
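To illustrate the operations on polynomials, here is a minimal sketch. It assumes a polynomial is a Python dict from expressions to non-zero integer weights, that expressions are plain strings, and that a tupled expression $E|F$ is represented by the pair `(E, F)`; trivial identities are not applied and all names are hypothetical.

```python
def add_poly(P, Q):
    """Sum of two polynomials; monomials with a null weight vanish."""
    out = dict(P)
    for E, k in Q.items():
        out[E] = out.get(E, 0) + k
    return {E: k for E, k in out.items() if k != 0}


def lmul_poly(k, P):
    """Left exterior product <k>P."""
    return {E: k * w for E, w in P.items() if k * w != 0}


def tuple_poly(P, Q):
    """Tupling P|Q: weights are multiplied, expressions are paired."""
    return {(E, F): k * h
            for E, k in P.items()
            for F, h in Q.items()
            if k * h != 0}


# <4> . de*|1  and  <2> . ce*|y, then their sum (compare Example 1).
left = tuple_poly({"de*": 4}, {"1": 1})
right = tuple_poly({"ce*": 2}, {"y": 1})
p1_ax = add_poly(left, right)
doubled = lmul_poly(2, p1_ax)
```

The dictionary `p1_ax` has the two monomials of the polynomial $P_{1,a|x}$ of Example 1.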


2.4 Finite Weighted Automata 


Our definition is slightly unusual in its handling of the labels, because it is 
meant for single and multitape automata. 


Definition 5 (Weighted Automaton) A weighted automaton $\mathcal{A}$ is a
tuple $\langle M, G, K, Q, E, I, T\rangle$ where:
• $M$ is a monoid,
• $G$ (the labels) is a set of generators of $M$,
• $K$ (the set of weights) is a semiring,
• $Q$ is a finite set of states,
• $I$ and $T$ are the initial and final functions from $Q$ into $K$,
• $E$ is a (partial) function from $Q \times G \times Q$ into $K \setminus \{0_K\}$;
its domain represents the transitions: (source, label, destination).
An automaton is proper if no label is $\varepsilon_M$.


Our automata are '$\varepsilon$-NFAs': they may have spontaneous transitions, as
we do not require $\varepsilon \notin G$.

The size of an automaton is its number of states: $|\mathcal{A}| := |Q|$.

A path $\pi$ is a sequence of transitions $(q_0, \ell_1, q_1)(q_1, \ell_2, q_2) \cdots (q_{n-1}, \ell_n, q_n)$
where the source of each is the destination of the previous one; its source is
$\iota(\pi) := q_0$, its destination is $\tau(\pi) := q_n$, its label is the word $\ell(\pi) := \ell_1 \cdots \ell_n$,
its weight is $w(\pi) := E(q_0, \ell_1, q_1) \cdots E(q_{n-1}, \ell_n, q_n)$, and its weighted label
[17] is the monomial $wl(\pi) := w(\pi)\ell(\pi)$. The set of paths of $\mathcal{A}$ is denoted
$\mathrm{Path}(\mathcal{A})$.


A state $q$ is initial if $I(q) \neq 0_K$. A state $q$ is accessible if there is a path
from an initial state to $q$. The accessible part of an automaton $\mathcal{A}$ is the
sub-automaton whose states are the accessible states of $\mathcal{A}$.

A computation $c$ is a path $\pi$ together with its initial and final functions at
the ends: $c := (I(\iota(\pi)), \pi, T(\tau(\pi)))$; its weight is $w(c) := I(\iota(\pi))w(\pi)T(\tau(\pi))$.
The evaluation of word $u$ by an automaton $\mathcal{A}$, $\mathcal{A}(u)$, is the sum of the weights
of all the computations labeled by $u$, or $0_K$ if there are none. The behavior
of $\mathcal{A}$ is the series $\llbracket \mathcal{A}\rrbracket := u \mapsto \mathcal{A}(u)$.
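The evaluation of a word can be sketched as follows. This is an illustration only: it assumes a proper single-tape automaton whose labels are single letters and whose weights are ordinary numbers, and it sums over computations state by state instead of enumerating paths. The example automaton (states `E`, `1`, `c*`) was derived by hand for the expression $a + \langle 2\rangle bc^*$ and is an assumption of this sketch.

```python
def evaluate(word, initial, final, transitions):
    """Weight of `word`: sum of the weights of its computations.

    initial, final: dicts state -> weight.
    transitions: dict (source, letter) -> list of (destination, weight).
    Assumes no spontaneous transitions.
    """
    current = dict(initial)
    for letter in word:
        following = {}
        for q, w in current.items():
            for dst, k in transitions.get((q, letter), []):
                following[dst] = following.get(dst, 0) + w * k
        current = following
    return sum(w * final.get(q, 0) for q, w in current.items())


initial = {"E": 1}
final = {"1": 1, "c*": 1}
transitions = {
    ("E", "a"): [("1", 1)],
    ("E", "b"): [("c*", 2)],
    ("c*", "c"): [("c*", 1)],
}
```

For instance `evaluate("bcc", initial, final, transitions)` follows the single computation through `c*` and multiplies the weights met along the way.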

Automata with spontaneous transitions may be invalid, if they have 
cycles of spontaneous transitions whose weight is not starrable [17]. 


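Definition 6 (Semantics of a State) Given a valid weighted automaton
$\mathcal{A} = \langle M, G, K, Q, E, I, T\rangle$, the semantics of state $q$ (aka, its future) is the
series:

\[
\llbracket q\rrbracket := T(q) + \sum_{\pi \in \mathrm{Path}(\mathcal{A}) \mid q = \iota(\pi)} wl(\pi)\, T(\tau(\pi)) \tag{1}
\]

Clearly, $\llbracket \mathcal{A}\rrbracket = \sum_{q \in Q} I(q)\llbracket q\rrbracket$.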


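Proposition 4 For any valid automaton $\mathcal{A}$, we have:

\[
\llbracket q\rrbracket = T(q) + \sum_{\ell \in G,\, q' \in Q} E(q, \ell, q')\, \ell\, \llbracket q'\rrbracket \tag{2}
\]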


The equivalence of Eqs. (1) and (2) can be seen as two different strategies
of evaluation: the first one is depth first (follow each path individually,
then sum their weights), the second one breadth first (starting from the set of
initial states, descend 'simultaneously' each transition, and repeat).

A simple proof by induction [8, Sec. 2.5] suffices in the absence of 
spontaneous transitions. With cycles of spontaneous transitions, we face 
infinite sums whose formal treatment requires arguments that go way beyond 
the scope of this paper, and the possibility that the automaton is invalid. 
This is in fact the core of the work of Lombardy and Sakarovitch [17]. 


3 Rational Expansions 


Given an expression such as $a + \langle 2\rangle bc^*$, we want an algorithm to compute
its derived-term automaton.

To this end, we introduce expansions. They are comparable to some
normal form for rational expressions. They highlight the labels by which an
expression may 'start' (the first), and then the various possible continuations
(a weighted set of expressions, aka, a polynomial). The expansion of $a + \langle 2\rangle bc^*$
is $a \odot [\langle 1\rangle \odot 1] \oplus b \odot [\langle 2\rangle \odot c^*]$.


3.1 Rational Expansions 


Definition 7 (Rational Expansion) A rational expansion $X$ is a term
$X ::= \langle k\rangle \oplus \ell_1 \odot [P_1] \oplus \cdots \oplus \ell_n \odot [P_n]$ where $k$ is a weight (possibly zero),
$\ell_i \in G$ are labels (occurring at most once), and $P_i$ non-null polynomials.
The immediate constant term of an expansion $X$, noted $X_\$$, is $k$. The firsts
of $X$ is $f(X) := \{\ell_1,\ldots,\ell_n\}$ (possibly empty) and its terms are $\mathrm{exprs}(X) :=
\bigcup_{i \in [n]} \mathrm{exprs}(P_i)$.


To ease reading, polynomials are written in square brackets. Contrary to
expressions and polynomials, there is no specific term for the zero expansion:
it is represented by $\langle 0_K\rangle$, the zero weight. Given an expansion $X$, we denote
by $X_\ell$ (or $X(\ell)$) the polynomial corresponding to $\ell$ in $X$, or the null polynomial
if $\ell \notin f(X)$. Expansions will thus be written: $X = \langle X_\$\rangle \oplus \bigoplus_{\ell \in f(X)} \ell \odot [X_\ell]$.
An expansion $X$ can be projected as a rational expression $\mathrm{expr}(X)$ by
mapping weights, labels and polynomials to their corresponding rational
expressions, and $\oplus$/$\odot$ to the sum/concatenation of expressions. Again, this
is performed on a canonical form of the expansion: labels are sorted.
Expansions also denote series: $\llbracket X\rrbracket := \llbracket \mathrm{expr}(X)\rrbracket$. An expansion $X$ is equivalent to
an expression $E$ iff $\llbracket X\rrbracket = \llbracket E\rrbracket$.

An expansion $X$ is immediately proper if $X_\$ = 0_K$; its immediate proper
part, $X_p$, is the expansion which coincides with $X$ but with a null immediate
constant term; hence⁷ $X = \langle X_\$\rangle \oplus X_p$. An immediately proper expansion
may denote an improper series: consider for instance $X := \varepsilon \odot [\langle 3\rangle \odot a^*]$; it
is immediately proper ($X_\$ = 0$), yet $\llbracket X\rrbracket(\varepsilon) = 3$.


Example 2 (Example 1 continued) Let expansion $X_1 := \langle 5\rangle \oplus a|x \odot
[P_{1,a|x}] \oplus b|x \odot [P_{1,b|x}]$. Its immediate constant term is 5, and $X_1$ maps the
generator $a|x$ (resp. $b|x$) to the polynomial $X_1(a|x) = P_{1,a|x}$ (resp. $X_1(b|x) =
P_{1,b|x}$). $X_1$ can be proved to be equivalent to $E_1$.


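Let $X, Y$ be expansions, $k$ a weight, and $E$ an expression (all possibly
null):

\[
\begin{aligned}
X \oplus Y &:= \langle X_\$ + Y_\$\rangle \oplus \bigoplus_{\ell \in f(X) \cup f(Y)} \ell \odot [X_\ell \oplus Y_\ell] && (3) \\
\langle k\rangle X &:= \langle k X_\$\rangle \oplus \bigoplus_{\ell \in f(X)} \ell \odot [\langle k\rangle X_\ell] \qquad
X\langle k\rangle := \langle X_\$ k\rangle \oplus \bigoplus_{\ell \in f(X)} \ell \odot [X_\ell\langle k\rangle] && (4) \\
X \cdot E &:= \bigoplus_{\ell \in f(X)} \ell \odot [X_\ell \cdot E] \quad \text{with $X$ immediately proper} && (5) \\
X \,|\, Y &:= \langle X_\$ Y_\$\rangle
 \oplus \langle X_\$\rangle \bigoplus_{\ell \in f(Y)} (\varepsilon|\ell) \odot [1 \,|\, Y_\ell]
 \oplus \langle Y_\$\rangle \bigoplus_{\ell \in f(X)} (\ell|\varepsilon) \odot [X_\ell \,|\, 1] \\
 &\qquad \oplus \bigoplus_{\ell|\ell' \in f(X) \times f(Y)} (\ell|\ell') \odot [X_\ell \,|\, Y_{\ell'}] && (6)
\end{aligned}
\]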


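Since by definition expansions never map to null polynomials, some firsts
might be smaller than suggested by these equations. For instance in $\mathbb{Z}$ the
sum of $\langle 1\rangle \oplus a \odot [\langle 1\rangle \odot b]$ and $\langle 1\rangle \oplus a \odot [\langle -1\rangle \odot b]$ is $\langle 2\rangle$.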
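The remark on vanishing firsts can be replayed with a small sketch of Eq. (3). It assumes an expansion is a pair `(constant, firsts)` where `firsts` maps each label to a polynomial, itself a dict from expressions (plain strings here) to non-zero integer weights; these representations and names are hypothetical.

```python
def add_poly(P, Q):
    """Sum of two polynomials; monomials with a null weight vanish."""
    out = dict(P)
    for E, k in Q.items():
        out[E] = out.get(E, 0) + k
    return {E: k for E, k in out.items() if k != 0}


def add_expansion(X, Y):
    """Sum of two expansions, Eq. (3).

    Labels whose polynomial becomes null are dropped, since an
    expansion never maps a first to the null polynomial.
    """
    (cx, fx), (cy, fy) = X, Y
    firsts = {}
    for label in set(fx) | set(fy):
        P = add_poly(fx.get(label, {}), fy.get(label, {}))
        if P:
            firsts[label] = P
    return (cx + cy, firsts)


X = (1, {"a": {"b": 1}})    # <1> + a . [<1> . b]
Y = (1, {"a": {"b": -1}})   # <1> + a . [<-1> . b]
total = add_expansion(X, Y)
```

The label `a` disappears from the sum, which reduces to the constant term 2.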


⁷The (straightforward) definition of addition of expansions, $\oplus$, will be given below.
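The following lemma is simple to establish: lift semantic equivalences,
such as Proposition 3, to syntax, using Lemma 1.


Lemma 2 $\llbracket X \oplus Y\rrbracket = \llbracket X\rrbracket + \llbracket Y\rrbracket$, $\llbracket \langle k\rangle X\rrbracket = \langle k\rangle\llbracket X\rrbracket$, $\llbracket X\langle k\rangle\rrbracket = \llbracket X\rrbracket\langle k\rangle$,
$\llbracket X \cdot E\rrbracket = \llbracket X\rrbracket \cdot \llbracket E\rrbracket$, $\llbracket X \,|\, Y\rrbracket = \llbracket X\rrbracket \,|\, \llbracket Y\rrbracket$.


3.2. Computing the Expansion of an Expression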


We introduce a procedure to compute an (equivalent) expansion from an 
expression. 


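Definition 8 (Expansion of a Rational Expression) The expansion of
a rational expression $E$, written $d(E)$, is defined inductively as follows:

\[
\begin{aligned}
d(0) &:= \langle 0_K\rangle \qquad d(1) := \langle 1_K\rangle \qquad d(a) := a \odot [\langle 1_K\rangle \odot 1] && (7) \\
d(E + F) &:= d(E) \oplus d(F) && (8) \\
d(\langle k\rangle E) &:= \langle k\rangle d(E) \qquad d(E\langle k\rangle) := d(E)\langle k\rangle && (9) \\
d(E \cdot F) &:= d_p(E) \cdot F \oplus \langle d_\$(E)\rangle d(F) && (10) \\
d(E^*) &:= \langle d_\$(E)^*\rangle \oplus \langle d_\$(E)^*\rangle d_p(E) \cdot E^* && (11) \\
d(E \,|\, F) &:= d(E) \,|\, d(F) && (12)
\end{aligned}
\]

where $d_\$(E) := d(E)_\$$, $d_p(E) := d(E)_p$ are the immediate constant term and
immediate proper part of $d(E)$.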


The right-hand sides are indeed expansions. The computation trivially 
terminates: induction is performed on strictly smaller subexpressions. Note 
that the firsts are a subset of the labels of the expression. 
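A possible transcription of Eqs. (7) to (11) is sketched below. It is restricted to single-tape expressions without right exterior product or tupling, assumes weights in $\mathbb{Q}$ with $k^* = 1/(1 - k)$ for $|k| < 1$, and encodes expressions as nested tuples; this encoding and all names are hypothetical, not the Vcsn implementation.

```python
from fractions import Fraction

ZERO, ONE = ("zero",), ("one",)


def prod(E, F):
    """E.F, with the trivial identities for 0 and 1 applied."""
    if E == ZERO or F == ZERO:
        return ZERO
    if E == ONE:
        return F
    return E if F == ONE else ("prod", E, F)


def add_exp(X, Y):
    """Eq. (3): sum of expansions (constant, {label: polynomial})."""
    (cx, fx), (cy, fy) = X, Y
    firsts = {}
    for label in set(fx) | set(fy):
        P = dict(fx.get(label, {}))
        for E, k in fy.get(label, {}).items():
            P[E] = P.get(E, 0) + k
        P = {E: k for E, k in P.items() if k != 0}
        if P:
            firsts[label] = P
    return (cx + cy, firsts)


def lmul_exp(k, X):
    """Eq. (4): left exterior product <k>X."""
    if k == 0:
        return (0, {})
    c, f = X
    return (k * c, {l: {E: k * w for E, w in P.items()} for l, P in f.items()})


def rprod_exp(f, F):
    """Eq. (5): product of an immediately proper expansion by F."""
    return (0, {l: {prod(E, F): w for E, w in P.items()} for l, P in f.items()})


def star(k):
    if abs(k) >= 1:
        raise ValueError("constant term is not starrable")
    return 1 / (1 - k)


def expansion(E):
    tag = E[0]
    if tag == "zero":
        return (Fraction(0), {})
    if tag == "one":
        return (Fraction(1), {})
    if tag == "atom":                                   # Eq. (7)
        return (Fraction(0), {E[1]: {ONE: Fraction(1)}})
    if tag == "sum":                                    # Eq. (8)
        return add_exp(expansion(E[1]), expansion(E[2]))
    if tag == "lmul":                                   # Eq. (9)
        return lmul_exp(E[1], expansion(E[2]))
    if tag == "prod":                                   # Eq. (10)
        c, f = expansion(E[1])
        return add_exp(rprod_exp(f, E[2]), lmul_exp(c, expansion(E[2])))
    if tag == "star":                                   # Eq. (11)
        c, f = expansion(E[1])
        k = star(c)
        return add_exp((k, {}), lmul_exp(k, rprod_exp(f, E)))
    raise ValueError(tag)


c_star = ("star", ("atom", "c"))
example = ("sum", ("atom", "a"), ("lmul", 2, ("prod", ("atom", "b"), c_star)))
result = expansion(example)
```

On $a + \langle 2\rangle bc^*$ the sketch yields the expansion quoted at the beginning of Section 3, $a \odot [\langle 1\rangle \odot 1] \oplus b \odot [\langle 2\rangle \odot c^*]$.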


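Example 3 (Examples 1 and 2 continued)
With $E_1 = \langle 5\rangle 1|1 + \langle 4\rangle ade^*|x + \langle 3\rangle bde^*|x + \langle 2\rangle ace^*|xy + \langle 6\rangle bce^*|xy$, one
has:

\[
\begin{aligned}
d(E_1) &= \langle 5\rangle \oplus a|x \odot [\langle 2\rangle \odot ce^*|y \oplus \langle 4\rangle \odot de^*|1] \oplus b|x \odot [\langle 6\rangle \odot ce^*|y \oplus \langle 3\rangle \odot de^*|1] \\
&= X_1 \quad \text{(from Example 2)}
\end{aligned}
\]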


This expansion is the introductory example from Section 2. 
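Proposition 5 The expansion of a rational expression is equivalent to the
expression.


Proof: We prove that $\llbracket d(E)\rrbracket = \llbracket E\rrbracket$ by induction on the expression. The
proof is straightforward for Eqs. (7) to (9) and (12), viz., $\llbracket d(E|F)\rrbracket =
\llbracket d(E) \,|\, d(F)\rrbracket$ (by Eq. (12)) $= \llbracket d(E)\rrbracket \,|\, \llbracket d(F)\rrbracket$ (by Lemma 2) $= \llbracket E\rrbracket \,|\, \llbracket F\rrbracket$ (by
induction hypothesis) $= \llbracket E \,|\, F\rrbracket$ (by Definition 2). The case of multiplication,
Eq. (10), follows from:

\[
\begin{aligned}
\llbracket d(E \cdot F)\rrbracket &= \llbracket d_p(E) \cdot F \oplus \langle d_\$(E)\rangle d(F)\rrbracket = \llbracket d_p(E)\rrbracket \cdot \llbracket F\rrbracket + \langle d_\$(E)\rangle \cdot \llbracket d(F)\rrbracket \\
&= \llbracket d_p(E)\rrbracket \cdot \llbracket F\rrbracket + \langle d_\$(E)\rangle \cdot \llbracket F\rrbracket = \bigl(\langle d_\$(E)\rangle + \llbracket d_p(E)\rrbracket\bigr) \cdot \llbracket F\rrbracket \\
&= \llbracket \langle d_\$(E)\rangle \oplus d_p(E)\rrbracket \cdot \llbracket F\rrbracket = \llbracket d(E)\rrbracket \cdot \llbracket F\rrbracket \\
&= \llbracket E\rrbracket \cdot \llbracket F\rrbracket = \llbracket E \cdot F\rrbracket
\end{aligned}
\]

The case of Kleene star, Eq. (11), follows from Proposition 2.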


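An expansion $X$ is normal if $\varepsilon \notin f(X_p)$. For instance $\langle 1\rangle \oplus a \odot [\langle \tfrac{1}{2}\rangle \odot
(\langle \tfrac{1}{2}\rangle a)^*]$ is normal, $\varepsilon \odot [\langle 1_K\rangle \odot (\langle \tfrac{1}{2}\rangle a)^*]$ is not. Both denote $(\langle \tfrac{1}{2}\rangle a)^*$.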


Lemma 3 The expansion of an expression is normal. 


This is easily established by a simple verification on Definition 8.
However, with the composition operator, normality is no longer guaranteed
(Section 5).


3.3. Derived Terms of an Expression 


In this section, we prove that repeated computations of expansions are 
‘generated’ by a finite number of expressions, called the derived terms. The 
result, Lemma 7, will prove that our construct builds finite automata. 

For any sets of expressions $S, T$, let $S \,|\, T := \{E \,|\, F\}_{E \in S, F \in T}$.


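Definition 9 (Derived Terms) The proper derived terms of an
expression $E$ is $\mathrm{PD}(E)$, the set of expressions defined inductively below:

\[
\begin{aligned}
\mathrm{PD}(0) &:= \emptyset \\
\mathrm{PD}(1) &:= \emptyset \\
\mathrm{PD}(\ell) &:= \{1\} && \forall \ell \in G \\
\mathrm{PD}(E + F) &:= \mathrm{PD}(E) \cup \mathrm{PD}(F) \\
\mathrm{PD}(\langle k\rangle E) &:= \mathrm{PD}(E) && \forall k \in K \\
\mathrm{PD}(E\langle k\rangle) &:= \{E_i\langle k\rangle \mid E_i \in \mathrm{PD}(E)\} && \forall k \in K \\
\mathrm{PD}(E \cdot F) &:= \{E_i \cdot F \mid E_i \in \mathrm{PD}(E)\} \cup \mathrm{PD}(F) \\
\mathrm{PD}(E^*) &:= \{E_i \cdot E^* \mid E_i \in \mathrm{PD}(E)\} \\
\mathrm{PD}(E \,|\, F) &:= (\mathrm{PD}(E) \,|\, \mathrm{PD}(F)) \cup (\{1\} \,|\, \mathrm{PD}(F)) \cup (\mathrm{PD}(E) \,|\, \{1\})
\end{aligned}
\]

where $G$ is the set of generators.
The derived terms of an expression $E$ is $\mathrm{D}(E) := \mathrm{PD}(E) \cup \{E\}$.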
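As a worked illustration (not taken from the original text), the definition can be unfolded on the single-tape expression $a + \langle 2\rangle bc^*$ used in Section 3, applying the trivial identity $1 \cdot E \Rightarrow E$ along the way:

```latex
\begin{aligned}
\mathrm{PD}(a) &= \mathrm{PD}(b) = \mathrm{PD}(c) = \{1\} \\
\mathrm{PD}(c^*) &= \{1 \cdot c^*\} = \{c^*\} \\
\mathrm{PD}(b \cdot c^*) &= \{1 \cdot c^*\} \cup \mathrm{PD}(c^*) = \{c^*\} \\
\mathrm{PD}(\langle 2\rangle bc^*) &= \mathrm{PD}(bc^*) = \{c^*\} \\
\mathrm{PD}(a + \langle 2\rangle bc^*) &= \{1\} \cup \{c^*\} = \{1, c^*\} \\
\mathrm{D}(a + \langle 2\rangle bc^*) &= \{1,\ c^*,\ a + \langle 2\rangle bc^*\}
\end{aligned}
```

These three derived terms are exactly the expressions that occur in the expansion $a \odot [\langle 1\rangle \odot 1] \oplus b \odot [\langle 2\rangle \odot c^*]$, together with the expression itself.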


This simple inductive definition is similar to Def. 3 of Lombardy and 
Sakarovitch [16]. Later, the authors changed their definition of derived term 
to rely on derivatives with respect to words [2, Def. 3]. The original definition 
denotes the set of potential derived terms, the second one denotes the true 
derived terms (i.e., the actual states of the derived-term automaton). While 
the two concepts coincide in the case of basic operators, they differ in ours: 
tupling (and later composition) introduce many potential derived terms, 
most of them not appearing in the resulting derived-term automaton. This 
is why we do not use the name true derived term and the notation TD. 


Lemma 4 (Number of Derived Terms) For any $k$-tape expression $E$,

\[
|\mathrm{PD}(E)| \leq \prod_{i \in [k]} (\|E\|_i + 1)
\]

Proof: It is simple to check by induction on $E$ that for all cases, except
tuple, $|\mathrm{PD}(E)| \leq \|E\|$ (which is the classical result for single-tape expressions,
see for example Lombardy and Sakarovitch [16, Theorem 2] or Angrand
et al. [2, Theorem 3]). In the case of $|$, it is clear that $|\mathrm{PD}(E|F)| \leq
(|\mathrm{PD}(E)| + 1) \cdot (|\mathrm{PD}(F)| + 1)$, hence the result.


Lemma 5 (Proper Derived Terms and Single Expansion) For any
expression $E$, $\mathrm{exprs}(d(E)) \subseteq \mathrm{PD}(E)$.


Proof: Established by a simple verification of Definition 8. 


The derived terms of derived terms of E are derived terms of E. In other 
words, repeated expansions never ‘escape’ the set of derived terms. 


Lemma 6 (Proper Derived Terms and Repeated Expansions) Let $E$
be an expression. For all $F \in \mathrm{PD}(E)$, $\mathrm{exprs}(d(F)) \subseteq \mathrm{PD}(E)$.
Proof: This will be proved by induction over $E$.

Case $E = 0$ or $E = 1$. Impossible, as then $\mathrm{PD}(E) = \emptyset$.


Case $E = a$. Then $\mathrm{PD}(E) = \{1\}$, hence $F = 1$ and therefore $d(F) = d(1) =
\langle 1_K\rangle$, so $\mathrm{exprs}(d(F)) = \emptyset \subseteq \mathrm{PD}(E)$.


Case $E = G + H$. Then $\mathrm{PD}(E) = \mathrm{PD}(G) \cup \mathrm{PD}(H)$. Suppose, without loss of
generality, that $F \in \mathrm{PD}(G)$. Then, by induction hypothesis, $\mathrm{exprs}(d(F)) \subseteq
\mathrm{PD}(G) \subseteq \mathrm{PD}(E)$.


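Case $E = \langle k\rangle G$. Then $F \in \mathrm{PD}(\langle k\rangle G) = \mathrm{PD}(G)$, so by induction hypothesis
$\mathrm{exprs}(d(F)) \subseteq \mathrm{PD}(G) = \mathrm{PD}(\langle k\rangle G) = \mathrm{PD}(E)$.

Case $E = G\langle k\rangle$. Then $\forall F \in \mathrm{PD}(G\langle k\rangle) = \{G_i\langle k\rangle \mid G_i \in \mathrm{PD}(G)\}$, there exists
an $i$ such that $F = G_i\langle k\rangle$. Then $d(F) = d(G_i\langle k\rangle) = d(G_i)\langle k\rangle$, hence
$\mathrm{exprs}(d(F)) = \mathrm{exprs}(d(G_i)\langle k\rangle)$.

Since $G_i \in \mathrm{PD}(G)$, by induction hypothesis $\mathrm{exprs}(d(G_i)) \subseteq \mathrm{PD}(G)$, so
by definition of the right exterior product of expansions (and
polynomials), $\mathrm{exprs}(d(G_i)\langle k\rangle) \subseteq \mathrm{PD}(G\langle k\rangle) = \mathrm{PD}(E)$.

Hence $\mathrm{exprs}(d(F)) \subseteq \mathrm{PD}(E)$.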


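Case $E = G \cdot H$. Then $\mathrm{PD}(E) = \{G_i \cdot H \mid G_i \in \mathrm{PD}(G)\} \cup \mathrm{PD}(H)$.


• If $F = G_i \cdot H$ with $G_i \in \mathrm{PD}(G)$, then $d(F) = d(G_i \cdot H) = d_p(G_i) \cdot
H \oplus \langle d_\$(G_i)\rangle d(H)$.
Since $G_i \in \mathrm{PD}(G)$, by induction hypothesis $\mathrm{exprs}(d_p(G_i)) =
\mathrm{exprs}(d(G_i)) \subseteq \mathrm{PD}(G)$. By definition of the product of an
expansion by an expression, $\mathrm{exprs}(d_p(G_i) \cdot H) \subseteq \{G_j \cdot H \mid G_j \in
\mathrm{PD}(G)\} \subseteq \mathrm{PD}(G \cdot H) = \mathrm{PD}(E)$. Moreover, by Lemma 5,
$\mathrm{exprs}(d(H)) \subseteq \mathrm{PD}(H) \subseteq \mathrm{PD}(E)$.

• If $F \in \mathrm{PD}(H)$, then by induction hypothesis $\mathrm{exprs}(d(F)) \subseteq \mathrm{PD}(H)
\subseteq \mathrm{PD}(E)$.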


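Case $E = G^*$. If $F \in \mathrm{PD}(E) = \{G_i \cdot G^* \mid G_i \in \mathrm{PD}(G)\}$, i.e., if $F = G_i \cdot G^*$
with $G_i \in \mathrm{PD}(G)$, then $d(F) = d(G_i \cdot G^*) = d_p(G_i) \cdot G^* \oplus \langle d_\$(G_i)\rangle d(G^*)$,
so $\mathrm{exprs}(d(F)) \subseteq \mathrm{exprs}(d_p(G_i) \cdot G^*) \cup \mathrm{exprs}(d(G^*))$.⁸ We will show that
both are subsets of $\mathrm{PD}(E)$, which will prove the result.

Since $G_i \in \mathrm{PD}(G)$, by induction hypothesis, $\mathrm{exprs}(d_p(G_i)) = \mathrm{exprs}(d(G_i))
\subseteq \mathrm{PD}(G)$, so by definition of a product of an expansion by an expression,
$\mathrm{exprs}(d_p(G_i) \cdot G^*) \subseteq \{G_j \cdot G^* \mid G_j \in \mathrm{PD}(G)\} = \mathrm{PD}(E)$.

By Lemma 5, $\mathrm{exprs}(d(G^*)) \subseteq \mathrm{PD}(G^*) = \mathrm{PD}(E)$.


⁸Given two expansions $X_1, X_2$, $\mathrm{exprs}(X_1 \oplus X_2) \subseteq \mathrm{exprs}(X_1) \cup \mathrm{exprs}(X_2)$, but they may
be different; consider for instance $X_1 = a \odot [\langle 1\rangle \odot 1]$ and $X_2 = a \odot [\langle -1\rangle \odot 1]$ with $K = \mathbb{Z}$.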


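Case $E = G \mid H$. Let
$F \in \mathrm{PD}(E) = \mathrm{PD}(G) \mid \mathrm{PD}(H) \,\cup\, \{1\} \mid \mathrm{PD}(H) \,\cup\, \mathrm{PD}(G) \mid \{1\}$.

• Suppose $F \in \mathrm{PD}(G) \mid \mathrm{PD}(H)$, i.e., $F = G_i \mid H_j$
  with $G_i \in \mathrm{PD}(G)$, $H_j \in \mathrm{PD}(H)$. By induction
  hypothesis $\mathrm{exprs}(d(G_i)) \subseteq \mathrm{PD}(G)$ and
  $\mathrm{exprs}(d(H_j)) \subseteq \mathrm{PD}(H)$, hence by definition of
  the tupling of expansions (Eq. (6))
  $\mathrm{exprs}(d(G_i) \mid d(H_j)) \subseteq (\mathrm{PD}(G) \mid \mathrm{PD}(H)) \cup (\{1\} \mid \mathrm{PD}(H)) \cup (\mathrm{PD}(G) \mid \{1\}) = \mathrm{PD}(E)$.

  We have $d(F) = d(G_i \mid H_j) = d(G_i) \mid d(H_j)$, so
  $\mathrm{exprs}(d(F)) = \mathrm{exprs}(d(G_i) \mid d(H_j)) \subseteq \mathrm{PD}(E)$.

• Suppose $F \in \{1\} \mid \mathrm{PD}(H)$, i.e., $F = 1 \mid H_j$ with
  $H_j \in \mathrm{PD}(H)$. By induction hypothesis
  $\mathrm{exprs}(d(H_j)) \subseteq \mathrm{PD}(H)$, hence by Eq. (6)
  $\mathrm{exprs}(d(1) \mid d(H_j)) = \mathrm{exprs}(\langle 1_K \rangle \mid d(H_j)) = \{1\} \mid \mathrm{exprs}(d(H_j)) \subseteq \{1\} \mid \mathrm{PD}(H) \subseteq \mathrm{PD}(E)$.

  We have $d(F) = d(1 \mid H_j) = d(1) \mid d(H_j)$, so
  $\mathrm{exprs}(d(F)) = \mathrm{exprs}(d(1) \mid d(H_j)) \subseteq \mathrm{PD}(E)$.

• The case $F \in \mathrm{PD}(G) \mid \{1\}$ is similar.

Lemma 7 (Derived Terms and Repeated Expansions)
Let $E$ be an expression. For all $F \in D(E)$,
$\mathrm{exprs}(d(F)) \subseteq \mathrm{PD}(E)$.

Proof: Since $D(E) = \mathrm{PD}(E) \cup \{E\}$, this is an immediate
consequence of Lemmas 5 and 6.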


4 Expansion-Based Derived-Term Automaton 
The repeated computations of expansions build an automaton. 


Definition 10 (Derived-Term Automaton)
The derived-term automaton of an expression $E$ over $G$ is the accessible
part of the automaton
$\mathcal{A}_E := \langle M, G', K, Q, E, I, T \rangle$ defined as follows:

• $Q$ is the set of rational expressions over $G$ with weights in $K$,

• $I = E \mapsto 1_K$,

• $E(F, \ell, F') = k$ iff $\ell \in f(d(F))$ and
  $\langle k \rangle \odot F' \in d_p(F)(\ell)$,

• $T(F) = d_\$(F)$.

[Figure 2 depicts the initial state $E$, its final weight $d_\$(E)$, and its
outgoing transitions, as read from the expansion
\[
d(E) = \underbrace{\langle E_\$ \rangle}_{d_\$(E)} \oplus
\underbrace{\ell_1 \odot [\langle k_{\ell_1,1} \rangle \odot E_{\ell_1,1} \oplus \cdots \oplus \langle k_{\ell_1,n_1} \rangle \odot E_{\ell_1,n_1}]
\oplus \cdots \oplus
\ell_m \odot [\langle k_{\ell_m,1} \rangle \odot E_{\ell_m,1} \oplus \cdots \oplus \langle k_{\ell_m,n_m} \rangle \odot E_{\ell_m,n_m}]}_{d_p(E)}
\]]

Figure 2: Initial part of the derived-term automaton of $E$. This figure is
somewhat misleading: some $E_{\ell,i}$ might be equal to an $E_{\ell',j}$
with $\ell \neq \ell'$, or to $E$, but never to another $E_{\ell,j}$. In
other words, from a given state, transitions with different labels may reach
common states.

Fig. 2 illustrates the process.

Even if $\varepsilon \notin G$, the derived-term automaton may have
spontaneous transitions ($G' = \{\varepsilon\} \cup G$). These provisions
will be used in Section 5.


Example 4 (Examples 1 to 3 continued) Fig. 1 shows the derived-term 
automaton of E, from the introductory example (Section 2 and Example 1). 


We must justify Definition 10 by proving that this automaton is finite. 


Theorem 1 For any $k$-tape expression $E$,
$|\mathcal{A}_E| \le \prod_{i \in [k]} (\|E\|_i + 1) + 1$.


Proof: First observe that the states of $\mathcal{A}_E$ are members of
$D(E)$ (this follows from a simple examination of the repeated computations
of expansions in Definition 10). Then Lemma 7 allows us to conclude.
Example 5 Let $\mathcal{A}_k$ be the derived-term automaton of the $k$-tape
expression $a_1^* \mid \cdots \mid a_k^*$. The states of $\mathcal{A}_k$ are
all the possible expressions where tape $i$ features $1$ or $a_i^*$, except
$1 \mid \cdots \mid 1$. Therefore $|\mathcal{A}_k| = 2^k - 1$, and
$\prod_{i \in [k]} (\|E\|_i + 1) = 2^k$.

[Figure: $\mathcal{A}_3$, the derived-term automaton of
$a^* \mid b^* \mid c^*$.]
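
The count in Example 5 can be checked mechanically. The sketch below is only an illustration: it assumes that a state is identified by the nonempty set S of tapes still carrying a starred letter, and that from S there is one transition per nonempty subset R of S (reading a letter on the tapes of R and the empty word elsewhere). The function names are hypothetical and not part of any library.

```python
from itertools import combinations


def nonempty_subsets(items):
    """All nonempty subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def star_tuple_automaton_size(k):
    """Count (states, transitions) under the assumptions stated above.

    A state is the nonempty set of tapes that still hold a starred letter;
    the tuple with 1 on every tape is never reached.
    """
    states = nonempty_subsets(range(k))
    # From state S, each nonempty subset R of S gives one transition S -> R.
    transitions = [(s, r) for s in states for r in nonempty_subsets(s)]
    return len(states), len(transitions)


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, star_tuple_automaton_size(k))
```

The number of states is 2^k - 1 as stated in the example; under the same assumptions the number of transitions is 3^k - 2^k.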


Theorem 2 If valid, any expression $E$ and its expansion-based derived-term
automaton $\mathcal{A}_E$ denote the same series, i.e.,
$\llbracket \mathcal{A}_E \rrbracket = \llbracket E \rrbracket$.


Since the expansions are normal (Lemma 3), the firsts of the immediate
proper part exclude $\varepsilon$; this automaton is therefore proper. As a
consequence, the proof of Demaille [9, Theorem 2] would suffice to establish
this result. However, with the introduction of the composition operator in
Section 5, expansions may no longer be normal, nor automata proper. We need
a more powerful proof.


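Proof: We show that the semantics of the states of $\mathcal{A}_E$
(Eq. (2)) and of the expressions in $D(E)$ define the same system of linear
equations.

Definition 10 shows that each state $q_F$ of $\mathcal{A}_E$ has the
following semantics:
\[
\llbracket q_F \rrbracket =
\sum_{\substack{\ell \in f(d(F)) \\ \langle k \rangle \odot F' \in d(F)(\ell)}}
k \, \ell \, \llbracket q_{F'} \rrbracket
\tag{13}
\]

Besides:
\begin{align*}
\llbracket F \rrbracket
 &= \llbracket d(F) \rrbracket \qquad \text{(by Proposition 5)} \\
 &= \Big\llbracket \bigoplus_{\ell \in f(d(F))} \ell \odot d(F)(\ell) \Big\rrbracket \\
 &= \sum_{\ell \in f(d(F))} \ell \, \llbracket d(F)(\ell) \rrbracket \\
 &= \sum_{\ell \in f(d(F))} \ell \,
    \Big\llbracket \bigoplus_{\langle k_{\ell,i} \rangle \odot F_{\ell,i} \in d(F)(\ell)}
    \langle k_{\ell,i} \rangle \odot F_{\ell,i} \Big\rrbracket \\
 &= \sum_{\ell \in f(d(F))} \ell
    \sum_{\langle k_{\ell,i} \rangle \odot F_{\ell,i} \in d(F)(\ell)}
    k_{\ell,i} \, \llbracket F_{\ell,i} \rrbracket \\
 &= \sum_{\substack{\ell \in f(d(F)) \\ \langle k_{\ell,i} \rangle \odot F_{\ell,i} \in d(F)(\ell)}}
    k_{\ell,i} \, \ell \, \llbracket F_{\ell,i} \rrbracket
 \tag{14}
\end{align*}

One can then verify that Eqs. (13) and (14) define the same system of
linear equations, hence
$\llbracket \mathcal{A}_E \rrbracket = \llbracket E \rrbracket$.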


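Example 6 Let $E_2 := (a^+ \mid x + b^+ \mid y)^*$, where $E^+ := E E^*$.
Its expansion is
\begin{align*}
d(E_2) &= \varepsilon|\varepsilon \odot [\langle 1 \rangle \odot 1] \\
 &\quad \oplus a|x \odot [(a^* \mid 1)(a^+ \mid x + b^+ \mid y)^*] \\
 &\quad \oplus b|y \odot [(b^* \mid 1)(a^+ \mid x + b^+ \mid y)^*] \\
 &= \varepsilon|\varepsilon \odot [\langle 1 \rangle \odot 1]
    \oplus a|x \odot [(a^* \mid 1) E_2]
    \oplus b|y \odot [(b^* \mid 1) E_2]
\end{align*}

Its derived-term automaton is:

[Figure: derived-term automaton of $E_2$, whose initial state is
$E_2 = (a^+ \mid x + b^+ \mid y)^*$.]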


It is straightforward to extract an algorithm from Definition 10, using a
work-list of states whose outgoing transitions remain to be computed (see
Algorithm 1).
This approach admits a natural lazy implementation: the whole automaton 
is not computed at once, but rather, states and transitions are computed 
on-the-fly, on demand, for instance when evaluating a word, or during a 
composition or a shortest path traversal, etc. One can apply transformations 
on the expansion before extracting transitions from it, for instance to generate 
deterministic/sequential automata [8, Section 4.2]. 
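
The work-list procedure described above can be sketched as follows. This is a minimal illustration, not the Vcsn implementation: the expansion is abstracted as a caller-supplied function `expand` (a hypothetical name) returning a constant term and, per label, a list of weighted derived terms. The toy table encodes the expansions of the three states of Example 6; the expansions of the two non-initial states were worked out by hand for this sketch and should be read as an assumption.

```python
from collections import deque


def derived_term_automaton(initial, expand):
    """Build (states, transitions, finals) with a work list.

    `expand(state)` must return (constant_term, {label: [(weight, next)]}).
    Each state is expanded exactly once, when it is popped.
    """
    states = {initial}
    transitions = []
    finals = {}
    todo = deque([initial])
    while todo:
        src = todo.popleft()
        constant, terms = expand(src)
        if constant != 0:
            finals[src] = constant          # final weight: the constant term
        for label, polynomial in terms.items():
            for weight, dst in polynomial:
                transitions.append((src, label, weight, dst))
                if dst not in states:       # a new state, to complete later
                    states.add(dst)
                    todo.append(dst)
    return states, transitions, finals


# Hand-written expansions for Example 6 ("e" stands for the empty word).
A, B = "(a*|1)E2", "(b*|1)E2"
TOY = {
    "E2": (1, {"a|x": [(1, A)], "b|y": [(1, B)]}),
    A: (1, {"a|e": [(1, A)], "a|x": [(1, A)], "b|y": [(1, B)]}),
    B: (1, {"b|e": [(1, B)], "a|x": [(1, A)], "b|y": [(1, B)]}),
}


def toy_expand(state):
    return TOY[state]


if __name__ == "__main__":
    q, e, t = derived_term_automaton("E2", toy_expand)
    print(len(q), "states,", len(e), "transitions")
```

Because `expand` is only called when a state is popped, a lazy variant simply stops popping until some consumer asks for the outgoing transitions of a state.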


Input : E, a rational expression
Output: ⟨E, I, T⟩ an automaton (simplified notation)

I(E) := 1_K ;                        // Unique initial state
Q := Queue(E) ;                      // A work list loaded with E
while Q is not empty do
    E := pop(Q) ;                    // A new state/expression to complete
    X := d(E) ;                      // Compute the expansion of E
    T(E) := X_$ ;                    // Final weight: the constant term
    foreach a ⊙ [X(a)] ∈ X do        // For each (first, polynomial) in X
        foreach ⟨k⟩ ⊙ F ∈ X(a) do    // For each monomial of X(a)
            E(E, a, F) := k ;        // New transition
            if F ∉ Q then            // F is a new state...
                push(Q, F) ;         // ...to complete later

Algorithm 1: Building the derived-term automaton. The set of states
is implicitly grown when transitions are added.


5 Support for Composition


Our goal is to introduce a composition operator in rational expressions, so
that, for instance, a|x @ x|c is equivalent to a|c. The construct is not
limited to two-tape automata, and the ‘zipping’ could be performed on any
tape, so for instance a|x|c (1 @ 2) A|B|x would denote a|A|B|c (or
a|c|A|B?). To avoid useless complications, we limit the presentation to the
simple case of two-tape expressions, where composition ‘zips’ the last tape
of the left with the first of the right. We also require both tapes to have
the same type. As a consequence, composition is an internal law.

Let $A$ be an alphabet. By $A'$ we denote $\{\varepsilon\} \cup A$. We use
$a, b, \ldots$ to denote letters of $A$, and $\ell, \ell'$ to denote labels
of $A'$.


5.1 Composition of Rational Series 


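Let $A$ be an alphabet, and
$s, t \in K\langle\langle A^* \times A^* \rangle\rangle$ two series. The
composition of $s$ with $t$ is the series
\[
s @ t := m|n \mapsto \sum_{x \in A^*} s(m|x) \cdot t(x|n),
\]
which also belongs to $K\langle\langle A^* \times A^* \rangle\rangle$.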
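
For series with finite support, the sum above can be computed directly. The sketch below is an illustration under stated assumptions: a series is represented as a Python dict from pairs of words to weights, and the semiring operations are passed as plain functions (`add`, `mul` are hypothetical parameter names, defaulting to ordinary arithmetic).

```python
from collections import defaultdict


def compose(s, t, add=lambda x, y: x + y, mul=lambda x, y: x * y):
    """Compose two finite-support series given as {(input, output): weight}.

    (s @ t)(m|n) is the sum over x of s(m|x) * t(x|n).
    """
    # Index t by its input word so that matching x is a dictionary lookup.
    by_input = defaultdict(list)
    for (x, n), weight in t.items():
        by_input[x].append((n, weight))

    result = {}
    for (m, x), ws in s.items():
        for n, wt in by_input.get(x, ()):
            term = mul(ws, wt)
            key = (m, n)
            result[key] = add(result[key], term) if key in result else term
    return result


if __name__ == "__main__":
    s = {("a", "x"): 2, ("a", "y"): 3}
    t = {("x", "c"): 5, ("y", "c"): 7}
    print(compose(s, t))                       # ordinary (+, *) semiring
    print(compose(s, t, add=min, mul=lambda x, y: x + y))  # (min, +)
```

The second call shows the same data interpreted in a tropical (min, +) semiring, as used for edit distances in the introduction.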


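Proposition 6 (Series Composition is Bilinear)
For all series
$s, s', t, t' \in K\langle\langle A^* \times A^* \rangle\rangle$, and all
weights $k \in K$,
\begin{align*}
(s + s') @ t &= s @ t + s' @ t & s @ (t + t') &= s @ t + s @ t' \\
(ks) @ t &= k (s @ t) & s @ (kt) &= k (s @ t)
\end{align*}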
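Proposition 7 For all series
$s, t \in K\langle\langle A^* \times A^* \rangle\rangle$, and labels
$\ell, \ell', \ell_1, \ell_2 \in A'$,
\[
((\ell_1|\ell) \cdot s) @ ((\ell'|\ell_2) \cdot t) = (\ell_1|\ell_2) \cdot
\begin{cases}
s @ t & \text{if } \ell = \ell' \\
((\varepsilon|\ell) \cdot s) @ t & \text{if } \ell \neq \varepsilon, \ell' = \varepsilon \\
s @ ((\ell'|\varepsilon) \cdot t) & \text{if } \ell = \varepsilon, \ell' \neq \varepsilon \\
0 & \text{otherwise}
\end{cases}
\]

Proof: With the convention that terms with undefined words (e.g.,
$a^{-1}b$) are null, we have:
\begin{align*}
\big(((\ell_1|\ell) \cdot s) @ ((\ell'|\ell_2) \cdot t)\big)(m|n)
 &= \sum_{x \in A^*} ((\ell_1|\ell) \cdot s)(m|x) \, ((\ell'|\ell_2) \cdot t)(x|n) \\
 &= \sum_{x \in A^*} s(\ell_1^{-1} m \mid \ell^{-1} x) \, t(\ell'^{-1} x \mid \ell_2^{-1} n) \\
 &= (\ell_1|\ell_2) \sum_{x \in A^*} s(m \mid \ell^{-1} x) \, t(\ell'^{-1} x \mid n)
\end{align*}

Then we reason by cases:

• if $\ell = \ell'$, then:
  \[
  \sum_{x \in A^*} s(m \mid \ell^{-1} x) \, t(\ell^{-1} x \mid n)
  = \sum_{y \in A^*} s(m|y) \, t(y|n) = (s @ t)(m|n)
  \]

• if $\ell \neq \varepsilon$ and $\ell' = \varepsilon$, then:
  \[
  \sum_{x \in A^*} s(m \mid \ell^{-1} x) \, t(x|n)
  = \sum_{x \in A^*} ((\varepsilon|\ell) \cdot s)(m|x) \, t(x|n)
  = (((\varepsilon|\ell) \cdot s) @ t)(m|n)
  \]

• the case $\ell = \varepsilon$ and $\ell' \neq \varepsilon$ is similar.

• if $\ell \neq \ell'$ and neither is the empty word, then at least one of
  $\ell^{-1} x$ or $\ell'^{-1} x$ is undefined.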


5.2 Composition of Weighted Rational Expressions 


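To Definition 1, we add a clause $E ::= E @ E$. Its semantics is defined by
$\llbracket E @ F \rrbracket := \llbracket E \rrbracket @ \llbracket F \rrbracket$.
Its trivial identities are:
\[
E @ 0 \Rightarrow 0 \qquad 0 @ E \Rightarrow 0 \qquad
(\langle k \rangle' 1) @ (\langle h \rangle' 1) \Rightarrow \langle kh \rangle' 1
\]
where, as in Definition 3, $\langle k \rangle' 1$ denotes either
$\langle k \rangle 1$, or $1$ in which case $k = 1_K$ in the right-hand side
of $\Rightarrow$.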
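The definition of composition of polynomials follows from the bilinearity
of composition. For
$P = \bigoplus_{i \in [n]} \langle k_i \rangle \odot E_i$ and
$Q = \bigoplus_{j \in [m]} \langle h_j \rangle \odot F_j$:
\[
P @ Q := \bigoplus_{(i,j) \in [n] \times [m]} \langle k_i \cdot h_j \rangle \odot E_i @ F_j
\]

With a proof similar to the case of $\mid$ in Lemma 1, we can prove that for
any polynomials $P$ and $Q$,
$\llbracket P @ Q \rrbracket = \llbracket P \rrbracket @ \llbracket Q \rrbracket$.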
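
As an illustration of this bilinear definition, here is a minimal sketch. It assumes a polynomial is stored as a dict from expressions to numeric weights, and it represents the composed expression as a tagged tuple `("@", E, F)`; both choices are conventions of this sketch only.

```python
def compose_polynomials(p, q):
    """Compose two polynomials given as {expression: weight} dicts.

    Every pair of monomials <k> . E and <h> . F contributes the monomial
    <k * h> . (E @ F); equal resulting expressions have their weights added.
    """
    result = {}
    for e, k in p.items():
        for f, h in q.items():
            term = ("@", e, f)
            result[term] = result.get(term, 0) + k * h
    return result


if __name__ == "__main__":
    p = {"E1": 2, "E2": 3}
    q = {"F1": 5}
    print(compose_polynomials(p, q))
```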


5.3 Composition of Rational Expansions 


The definition of expansion composition may look straightforward: just
‘zip’ on the common letters, and compose the corresponding expressions.
For instance
$((a|x) \odot [E] \oplus (a|y) \odot [E']) @ ((x|b) \odot [F] \oplus (z|b) \odot [F'])$
results in $(a|b) \odot [E @ F]$: only $x$ appears both in output and input.

However, the empty word makes things more interesting. Consider for
instance
$(a|\varepsilon \odot [\varepsilon|x]) @ (\varepsilon|b \odot [x|\varepsilon])$:
it denotes $(a|x) @ (x|b) = a|b$. Therefore the empty word must be
‘zippable’ with any other label; this applies to the empty word as output
of the left-hand side:
$(a|\varepsilon \odot [P]) @ (x|b \odot [Q]) \Rightarrow a|\varepsilon \odot [P @ ((x|b) Q)]$,
and as input of the right-hand side:
$(a|x \odot [P]) @ (\varepsilon|b \odot [Q]) \Rightarrow \varepsilon|b \odot [((a|x) P) @ Q]$.
However, we must be careful not to pair $a|\varepsilon$ with
$\varepsilon|b$ twice, once for $\varepsilon$ as the output label of the
left-hand side and once for $\varepsilon$ as the input label of the
right-hand side: that would denote $\langle 2 \rangle a|b$ instead of
$a|b$. Hence the following definition:


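\[
\begin{aligned}
X @ Y :={} & \langle X_\$ Y_\$ \rangle \\
& \oplus \bigoplus_{\varepsilon|\ell \in f(Y)}
   (\varepsilon|\ell) \odot [\langle X_\$ \rangle (1 @ Y_{\varepsilon|\ell})] \\
& \oplus \bigoplus_{\ell|\varepsilon \in f(X)}
   (\ell|\varepsilon) \odot [\langle Y_\$ \rangle (X_{\ell|\varepsilon} @ 1)] \\
& \oplus \bigoplus_{\substack{\ell_1|\ell \in f(X) \\ \ell'|\ell_2 \in f(Y)}}
   (\ell_1|\ell_2) \odot
   \begin{cases}
   X_{\ell_1|\ell} @ Y_{\ell'|\ell_2} & \text{if } \ell = \ell' \\
   X_{\ell_1|\ell} @ (\ell'|\varepsilon) Y_{\ell'|\ell_2} & \text{if } \ell = \varepsilon, \ell' \neq \varepsilon \\
   (\varepsilon|\ell) X_{\ell_1|\ell} @ Y_{\ell'|\ell_2} & \text{if } \ell \neq \varepsilon, \ell' = \varepsilon
   \end{cases}
\end{aligned}
\tag{15}
\]

where $\ell$, $\ell'$, $\ell_1$, and $\ell_2$ are labels.
The following lemma is proved using Propositions 6 and 7.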
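
The case analysis on labels in the last sum of the definition can be sketched as below. This is a simplified illustration, not the paper's implementation: it only handles the proper parts (the two sums involving constant terms are left out), residual terms are kept as plain strings, weights are ignored, and the empty word is written `""` in labels and `e` inside the generated strings. All names are hypothetical.

```python
EPS = ""  # the empty word, used as a tape label


def compose_proper_parts(x, y):
    """Zip two proper parts given as {(in_label, out_label): residual}.

    Returns {(l1, l2): [residual strings]} following the three cases on the
    output label l of the left operand and the input label lp of the right.
    """
    result = {}
    for (l1, l), xe in x.items():
        for (lp, l2), ye in y.items():
            if l == lp:                    # same label (possibly both empty)
                term = f"({xe}) @ ({ye})"
            elif l == EPS:                 # left emits nothing: keep lp pending
                term = f"({xe}) @ (({lp}|e)({ye}))"
            elif lp == EPS:                # right reads nothing: keep l pending
                term = f"((e|{l})({xe})) @ ({ye})"
            else:                          # distinct letters never zip
                continue
            result.setdefault((l1, l2), []).append(term)
    return result


if __name__ == "__main__":
    # Only the letter x is shared between outputs (left) and inputs (right).
    left = {("a", "x"): "E", ("a", "y"): "E'"}
    right = {("x", "b"): "F", ("z", "b"): "F'"}
    print(compose_proper_parts(left, right))
    # a|e and e|b are paired exactly once, by the first case.
    print(compose_proper_parts({("a", EPS): "P"}, {(EPS, "b"): "Q"}))
```

Because the three cases are mutually exclusive, a pair of labels that are both empty falls only into the first case, which is how the double pairing discussed above is avoided.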


Lemma 8 $\llbracket X @ Y \rrbracket = \llbracket X \rrbracket @ \llbracket Y \rrbracket$.

To compute the expansion of an expression with composition, Definition 8
only needs one additional case:
\[
d(E @ F) := d(E) @ d(F)
\]


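Example 7 Consider the introductory example in $\mathbb{Z}_{\min}$. Let
$F := (a|a + b|b + \langle 1 \rangle (1|I + [ab]|S))^* @ (a|a + b|b + I|[ab] + S|1)^*$.
Its derived-term automaton is exactly the automaton from Mohri
[20, Figure 4]:

[Figure: single-state automaton with loops
$\langle 0 \rangle a|a$, $\langle 0 \rangle b|b$,
$\langle 1 \rangle \varepsilon|a$, $\langle 1 \rangle \varepsilon|b$,
$\langle 1 \rangle a|\varepsilon$, $\langle 1 \rangle b|\varepsilon$.]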


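Expansions of expressions with composition may not be normal, which
will result in derived-term automata with spontaneous transitions.


Example 8 Let $E := (\langle k \rangle 1|a)^*$ and
$F := (\langle h \rangle aa|1)^*$. The derived-term automaton of $E @ F$
is:

[Figure: two-state automaton with initial state $E @ F$ and second state
$E @ (a|1)F$, linked by spontaneous transitions such as
$\langle k \rangle \varepsilon|\varepsilon$.]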


Theorem 3 (Theorem 2 with Composition) If valid, any multitape expression
with compositions $E$ and its expansion-based derived-term automaton
$\mathcal{A}_E$ denote the same series, i.e.,
$\llbracket \mathcal{A}_E \rrbracket = \llbracket E \rrbracket$.


Proof: Because it already ‘supports’ automata with spontaneous transitions,
the proof of Theorem 2 still applies here. We must however justify that the
automaton is indeed finite.

With the convention that $(\varepsilon|\varepsilon) E = E$, we can define
the proper derived terms of $E @ F$ as:
$\mathrm{PD}(E @ F) := \{\varepsilon|\ell\}_{\ell \in A'} \mathrm{PD}(E) @ \{\ell|\varepsilon\}_{\ell \in A'} \mathrm{PD}(F)$.
With this definition, Lemmas 5 and 6 apply to expressions with
compositions, which proves that the set of derived terms of a multitape
expression with composition is finite.


Contrary to expressions without composition, the procedure may
successfully build an invalid automaton. For a start, consider the
automaton of Example 8. This automaton is valid in $\mathbb{B}$ (as any
automaton...), but might be invalid in $\mathbb{Q}$ depending on the
starrability of the weight $k^2 h$. In all the cases, the validity of the
automaton is equivalent to the validity of the expression. Unfortunately,
there exist cases where this procedure builds invalid automata from valid
expressions (Section 6.1).
6 Discussion


This section addresses several issues:

• not all the generated automata are valid (Section 6.1),

• the expressions may be normalized before and during the computations
  (Section 6.2),

• the computations can be simplified by relying more on spontaneous
  transitions, at the cost of creating useless states (Section 6.3),

• it is possible to keep simple computation and generate the same
  automata (Section 6.4),

• an efficient implementation of the procedure must pay attention to
  some issues (Section 6.5),

• the generated automata are small (Section 6.6),

• the tupling operator can be supported by the derivative-based
  computation of the derived-term automaton (Section 6.7).


6.1 On the Validity of Automata 


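The computations of the expansion of the product and of the star of
expressions are quite involved:
\begin{align*}
d(E \cdot F) &:= d_p(E) \cdot F \oplus \langle d_\$(E) \rangle d(F) && \text{(Eq. (10))} \\
d(E^*) &:= \langle d_\$(E)^* \rangle \oplus \langle d_\$(E)^* \rangle d_p(E) \cdot E^* && \text{(Eq. (11))}
\end{align*}
whereas some simpler versions enjoying the freedom to use non-normal
expansions (see Section 3.2) suffice:
\begin{align}
d(E \cdot F) &:= d(E) \cdot F \tag{16} \\
d(E^*) &:= \langle 1_K \rangle \oplus d(E) \cdot E^* \tag{17}
\end{align}

They generate arguably more natural automata, but with spontaneous
transitions.

[Figure: side-by-side comparison of the automata built with Eqs. (16) and
(17) (left column) and with Eqs. (10) and (11) (right column) on two sample
expressions.]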
Also, a significant advantage of Eq. (17) over Eq. (11) is that its
correctness is straightforward to prove (it follows from $s^* = 1 + s s^*$),
while justifying Eq. (11) required the Super S property (see Propositions 1
and 2), whose proof involves topological arguments. With Eqs. (16) and (17)
all these ‘details’ are delegated to the spontaneous transition removal
procedure, as discussed by Lombardy and Sakarovitch [17] for instance.

Therefore, Eqs. (10) and (11) may appear as mere optimizations of 
Eqs. (16) and (17): they generate automata that have fewer spontaneous 
transitions. 

Alas, we are then exposed to the same problems as the techniques
that start from the Thompson automaton to compute different types of
automata [1]: for some valid expressions, we generate invalid automata.
For instance in $\mathbb{Q}$, the expression
$(a^* + \langle -1 \rangle 1)^*$ is valid, as
$\llbracket a^* + \langle -1 \rangle 1 \rrbracket$ is proper, yet its
Thompson automaton is invalid, as it contains a spontaneous cycle whose
weight, 1, is not starrable:

[Figure: Thompson automaton of $(a^* + \langle -1 \rangle 1)^*$.]

The following expressions show invalid automata built by Eqs. (16)
and (17) and the corresponding valid ones using Eqs. (10) and (11).

[Figure: automata built with Eqs. (16) and (17) (left column) and with
Eqs. (10) and (11) (right column).]

Since it may also generate non-normal expansions, our handling of the
composition may generate invalid automata from valid expressions too; for
instance with $E := (1|ab @ ab|1 + \langle -1 \rangle 1|1)^*$:

[Figure: automaton with initial state $E$ and second state
$(1|b @ b|1)E$, whose transitions are all spontaneous:
$\langle -1 \rangle \varepsilon|\varepsilon$ and
$\varepsilon|\varepsilon$.]

However, this can never happen in positive semirings.


6.2 Identities on Expressions 


Small is beautiful. The smaller the automaton, the better. Therefore, it is 
natural to be eager at simplifying the expressions, and apply all the possible 
transformations that help reducing the size of the automaton [22]. 

Yet we chose relatively few identities: basically, those of Definition 3 are
about the constants ($0$, $1$, $0_K$, $1_K$). The identities were chosen to
avoid useless clutter in the examples. Compare for instance the derived-term
automaton of $a^*$ with, and without, the trivial identity
$1 \cdot E \Rightarrow E$:

[Figure: the two derived-term automata of $a^*$, with and without the
identity.]

Eliminating zeroes ($0$, $0_K$) allows us to accept expressions that contain
an invalid but useless part. For instance
$a + \langle 0 \rangle 1^* + 0 1^*$ processed without identities would fail
in $\mathbb{Q}$. Simplifying zeros also avoids creating non-coaccessible
states (consider $abc0$ for instance).


However, none of the trivial identities is needed for the procedure 
to terminate: everything is taken care of by the polynomials. Identities 
are not even needed to guarantee that expansions of expressions (without 
composition) are normal, i.e., that the derived-term automaton is proper. 


6.3 Denormalized Expressions


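Both theory and implementation are simpler with denormalized expansions,
whose immediate constant term is moved into the terms of the empty word.
Consider for instance the expansion of $a^* \langle 2 \rangle$:
\begin{align}
& \langle 2 \rangle \oplus a \odot [\langle 1_K \rangle \odot a^* \langle 2 \rangle]
  && \text{normal expansion} \tag{18} \\
& \varepsilon \odot [\langle 2 \rangle \odot 1] \oplus a \odot [\langle 1_K \rangle \odot a^* \langle 2 \rangle]
  && \text{denormalized expansion} \tag{19}
\end{align}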
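With denormalized expansions, Eqs. (3) to (6) and (15) can be simplified
into:
\begin{align*}
X \oplus Y &:= \bigoplus_{\ell \in f(X) \cup f(Y)} \ell \odot [X_\ell \oplus Y_\ell] \\
\langle k \rangle X &:= \bigoplus_{\ell \in f(X)} \ell \odot [\langle k \rangle X_\ell]
\qquad
X \langle k \rangle := \bigoplus_{\ell \in f(X)} \ell \odot [X_\ell \langle k \rangle] \\
X \cdot E &:= \bigoplus_{\ell \in f(X)} \ell \odot [X_\ell \cdot E] \\
X \mid Y &:= \bigoplus_{\substack{\ell \in f(X) \\ \ell' \in f(Y)}} (\ell|\ell') \odot [X_\ell \mid Y_{\ell'}] \\
X @ Y &:= \bigoplus_{\substack{\ell_1|\ell \in f(X) \\ \ell'|\ell_2 \in f(Y)}}
 (\ell_1|\ell_2) \odot
 \begin{cases}
 X_{\ell_1|\ell} @ Y_{\ell'|\ell_2} & \text{if } \ell = \ell' \\
 X_{\ell_1|\ell} @ (\ell'|\varepsilon) Y_{\ell'|\ell_2} & \text{if } \ell = \varepsilon, \ell' \neq \varepsilon \\
 (\varepsilon|\ell) X_{\ell_1|\ell} @ Y_{\ell'|\ell_2} & \text{if } \ell \neq \varepsilon, \ell' = \varepsilon
 \end{cases}
\end{align*}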

With adjusted definitions of immediate constant term (the weight 
associated to 1 in the term of ¢) and immediate proper part, the remainder 
of the automaton construction procedure remains the same. 


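However, this introduces new terms in the case of composition. Consider
expansions $X := \langle 1 \rangle$ and $Y := a|b \odot [1]$. Their (normal)
composition with no identities is $0$. The denormalized expansion of $X$ is
$X' := \varepsilon|\varepsilon \odot [1|1]$, and its composition with $Y$
yields $\varepsilon|b \odot [1|1 @ a|1]$. These results yield two different
automata:

[Figure: on the left, the single state $1|1 @ a|b$ without transitions; on
the right, the state $1|1 @ a|b$ with a transition labeled $\varepsilon|b$
to the state $1|1 @ a|1$, which carries a spontaneous loop
$\varepsilon|\varepsilon$.]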


The second automaton includes a (useless) spontaneous loop which is
invalid in $\mathbb{Q}$ for instance. In this case, stronger identities on
the composition simplify the expression $1|1 @ a|b$ into $0$, which solves
the problem, but in other cases these spurious transitions remain. For
instance, the expression from Example 8 yields the following automaton with
denormalized expansions:

[Figure: automaton of Example 8 built with denormalized expansions; it
includes additional states such as $(1|a)E @ 1|1$, $1|1 @ (a|1)^2 F$ and
$1|1 @ (a|1)F$, reached through spontaneous transitions
$\varepsilon|\varepsilon$.]

With denormalized expansions, the introductory example (Example 7)
yields eight such useless states, in addition to the single useful one.


6.4 The Endmarker 


Both types of expansions (denormalized or not) have different pros and cons. 

With (non denormalized) expansions (Eq. (18)) the constant term has 
a different nature from the rest of the expansion, which lacks elegance. The 
equations are somewhat complex. 

With denormalized expansions (Eq. (19)) the immediate constant term 
is buried with other derived terms following the empty word, which leads to 
convoluted definitions of the immediate constant term and of the immediate 
proper part. The interpretation of the latter is somewhat clumsy: sometimes 
they denote final states, sometimes spontaneous transitions. Besides, hiding 
the (immediate) constant term in the derived terms of ¢ blurs the important 
distinction between normal expansions (that yield automata without sponta- 
neous transitions) and non normal ones. Finally, the generated automata 
may have many useless states (Section 6.3). 

These concerns can be addressed if we introduce an endmarker (aka
end of tape symbol, or terminator), $\$$, added at the end of the
expression. For instance the expansion of $a^* \langle 2 \rangle \$$ is:
\[
\$ \odot [\langle 2 \rangle \odot 1] \oplus a \odot [\langle 1_K \rangle \odot a^* \langle 2 \rangle]
\qquad \text{with endmarker}
\]
to compare with Eqs. (18) and (19). (Expressions/automata with/without
endmarker are equivalent [24, Proposition IV.5.1 p. 579].) We keep the
simpler and more regular equations of denormalized expansions: the constant
term becomes a regular weight, associated to the only possible derived term
in the polynomial of $\$$: $1$. And we also avoid the useless states, as
with non-denormalized expansions.

This also allows us to simplify the construction of the derived-term
automaton. In Vcsn for example [10], the initial and final weights are
implemented as weights of special transitions from the unique preinitial
state to the initial states, and from the final states to the unique
postfinal state, all labeled with the endmarker (which therefore also
serves as a beginmarker).

In the mathematical definition of an automaton, this corresponds to
the replacement of the initial and final functions, $I$ and $T$, by two
constants: the pre and post states. Starting with an endmarker at both
ends, $\$E\$$, Definition 10 can then be simplified as:

• $Q$ is the set of rational expressions on alphabet $A$ with weights in
  $K$,

• $E(F, \ell, F') = k$ iff $\langle k \rangle \odot F' \in d(F)(\ell)$, for
  all labels $\ell \in \{\$, \varepsilon\} \cup A$.


Using the endmarker, the simplified definitions of Section 6.3 can be 
used in place of Eqs. (3) to (6) and (15) and yield the same automata. This 
vastly simplifies the implementation. 


6.5 Implementation Issues 


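In an implementation, a single recursive call to $d(E)$ suffices for
Eqs. (10) and (11), from which $d_\$(E)$ and $d_p(E)$ are obtained;
expansions are computed only when needed. So they should rather be written:
\begin{align*}
d(E \cdot F) &:= \text{let } X = d(E) \text{ in if } \langle X_\$ \rangle \neq 0_K
 \text{ then } X_p \cdot F \oplus \langle X_\$ \rangle d(F) \text{ else } X_p \cdot F \\
d(E^*) &:= \text{let } X = d(E) \text{ in } \langle X_\$^* \rangle \oplus \langle X_\$^* \rangle X_p \cdot E^*
\end{align*}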


Besides, existing expressions are referenced, not duplicated. In the
previous piece of code, $E^*$ is not built again: the input argument is
reused.
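
The two points above (a single recursive call, and reuse of existing expressions) can be illustrated with a small single-tape sketch. It is not the Vcsn code: expressions are encoded as nested tuples, weights are plain numbers, the star is only handled when the constant term of its operand is zero, and every name below is an assumption of this sketch.

```python
ONE = ("one",)


def mul(e, f):
    """Product of expressions, applying the trivial identity 1 . E => E."""
    return f if e == ONE else ("mul", e, f)


def merge(into, proper, scale=1):
    """Add `scale` times the proper part `proper` into `into` (in place)."""
    for letter, poly in proper.items():
        target = into.setdefault(letter, {})
        for term, weight in poly.items():
            target[term] = target.get(term, 0) + scale * weight


def expand(e):
    """Return (constant_term, {letter: {derived_term: weight}})."""
    kind = e[0]
    if kind == "zero":
        return 0, {}
    if kind == "one":
        return 1, {}
    if kind == "lit":
        return 0, {e[1]: {ONE: 1}}
    if kind == "add":
        c1, p1 = expand(e[1])
        c2, p2 = expand(e[2])
        proper = {}
        merge(proper, p1)
        merge(proper, p2)
        return c1 + c2, proper
    if kind == "mul":
        const, proper = expand(e[1])            # single recursive call on E
        result = {a: {mul(g, e[2]): k for g, k in poly.items()}
                  for a, poly in proper.items()}
        if const == 0:                          # d(F) is not needed at all
            return 0, result
        c2, p2 = expand(e[2])
        merge(result, p2, scale=const)
        return const * c2, result
    if kind == "star":
        const, proper = expand(e[1])            # single recursive call on E
        if const != 0:
            raise ValueError("this sketch only handles c(E) = 0 under a star")
        # The input expression `e` (that is, E*) is reused, not rebuilt.
        return 1, {a: {mul(g, e): k for g, k in poly.items()}
                   for a, poly in proper.items()}
    raise ValueError(f"unknown expression kind: {kind!r}")


if __name__ == "__main__":
    ab_star = ("star", ("mul", ("lit", "a"), ("lit", "b")))
    print(expand(ab_star))
```

Since tuples are immutable, the derived term built in the `star` case holds a reference to the very object passed as argument, which is the sharing the text refers to.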

Identities that enforce right-associativity of the product are a strong
optimization that saves recursive calls. Consider $((ab)c)d$; computing its
expansion requires that of $(ab)c$, which requires that of $(ab)$, which
requires that of $a$, which is $a \odot [\langle 1_K \rangle \odot 1]$,
that we multiply by $b$ to get $a \odot [\langle 1_K \rangle \odot 1b]$,
then multiplied by $c$, and finally by $d$, which results in
$a \odot [\langle 1_K \rangle \odot ((1b)c)d]$. Note that $((1b)c)d$ is
still left-associative and will require similarly deep computations. On the
contrary, the expansion of $a(b(cd))$ is computed in a single step:
$a \odot [\langle 1_K \rangle \odot 1(b(cd))]$.

At each step of the construction of the derived-term automaton we
compute the expansion of an expression, extract its terms and add transitions
to the states of these terms. It is therefore critical to use an efficient
structure to store and retrieve the derived terms. Hash tables are well
suited for this task.

                                                               Derived-Term    Inductive
Expression                                                       #S     #T     #S     #T
([ab] + ⟨1⟩(ε|[ab] + [ab]|ε))*                                    1      6      7     42
[ab]*(⟨2⟩(a|b + b|a) + ⟨1⟩(ε|[ab] + [ab]|ε))*                     2     14      9     60
([ab] + ⟨1⟩(ε|I + (a + b)|S))*                                    1      5      6     30
([ab] + S|ε + I|[ab])*                                            1      5      6     30
([ab] + ⟨1⟩(ε|I + [ab]|S))* @ ([ab] + S|ε + I|[ab])*              1      6      7     42
⟨4⟩ade*|x + ⟨3⟩bde*|x + ⟨2⟩ace*|xy + ⟨6⟩bce*|xy                   4      7     13     16
a + ⟨2⟩(bc*)                                                      3      3      4      4
a*|b*|c*|d*|e*                                                   31    211     32    242
(a⁺|x + b⁺|y)*                                                    3      8      5     14
(⟨k⟩ε|a)* @ (⟨h⟩aa|ε)*                                            2      2      3      3

Table 1: Number of states (#S) and of transitions (#T) of the derived-term
and inductive automata for the expressions used in this paper. We used
traditional abbreviations, implemented in Vcsn: [ab] := a + b, and
single-tape expressions in multitape context denote partial identities,
e.g., a + ⟨2⟩(bc*) denotes (a|a) + ⟨2⟩((b|b)(c|c)*).


6.6 Performances 


We claimed that this construction builds small automata. On single-tape
expressions, it is well known that the size of the standard automaton (aka
Glushkov automaton) of an expression $E$ is exactly $\|E\| + 1$ [6], and
that the size of the derived-term automaton is at most $\|E\| + 1$ but
‘often’ much smaller.

In Vcsn we implemented inductive, a generalization of the recursive
implementation of the computation of the standard automaton of an
expression (see Lombardy and Sakarovitch [16, pp. 163-164] for instance)
with support for the | and @ operators, and compared the sizes of the
automata for the expressions used as examples in this paper. The results are
presented in Table 1. The derived-term automaton never has more states or
transitions on these examples.

Benchmarks run on generated expressions show similar results (see
http://vcsn.lrde.epita.fr/dload/2.6/notebooks/SACS-2017.html). There,
the speeds of both implementations are also compared.
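6.7 Multitape Derivatives


We reproduce here the definition of constant terms and derivatives from
Lombardy and Sakarovitch [16, p. 148 and Def. 2], with our notations and
added support for multitape expressions. To facilitate reading, weights
such as the constant term are written in angle brackets, although so far
this was reserved to syntactic constructs.


Definition 11 (Constant Term and Derivative)
\begin{align}
c(0) &:= \langle 0_K \rangle, & \partial_a 0 &:= 0, \tag{20} \\
c(1) &:= \langle 1_K \rangle, & \partial_a 1 &:= 0, \notag \\
c(a) &:= \langle 0_K \rangle, \ \forall a \in A, & \partial_a b &:= 1 \text{ if } b = a, \ 0 \text{ otherwise}, \tag{21} \\
c(E + F) &:= c(E) + c(F), & \partial_a (E + F) &:= \partial_a E \oplus \partial_a F, \tag{22} \\
c(\langle k \rangle E) &:= \langle k \rangle c(E), & \partial_a (\langle k \rangle E) &:= \langle k \rangle (\partial_a E), \tag{23} \\
c(E \cdot F) &:= c(E) \cdot c(F), & \partial_a (E \cdot F) &:= (\partial_a E) \cdot F \oplus \langle c(E) \rangle \partial_a F, \tag{24} \\
c(E^*) &:= c(E)^*, & \partial_a E^* &:= \langle c(E)^* \rangle (\partial_a E) \cdot E^*, \tag{25} \\
c(E \mid F) &:= c(E) \cdot c(F), & \partial_{a|b} (E \mid F) &:= \partial_a E \mid \partial_b F, \tag{26} \\
 & & \partial_{a|\varepsilon} (E \mid F) &:= \langle c(F) \rangle (\partial_a E \mid 1), \notag \\
 & & \partial_{\varepsilon|b} (E \mid F) &:= \langle c(E) \rangle (1 \mid \partial_b F). \notag
\end{align}
where Eq. (25) applies iff $c(E)^*$ is defined in $K$.


From an implementation point of view, Eq. (26) leads to repeated
computations of $\partial_a E$ and of $\partial_b F$, unless one caches
them, but that is what expansions do.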


Lemma 9 For any expression $E$ (without composition),
$d(E)(\varepsilon) = c(E)$, and $d(E)(a) = \partial_a E$.


Proof: A straightforward induction on $E$. The cases of constants and
letters are immediate consequences of Eqs. (20) and (21) on the one hand,
and Eq. (7) on the other hand. Equation (8) matches Eqs. (22) and (23).
Multiplication (concatenation) is again barely a change of notation between
Eq. (10) and Eq. (24), and likewise for the Kleene star (Eqs. (11) and (25))
and tuple (Eqs. (12) and (26), using Eq. (6)).

Note that, if we were to define the derivative with respect to the empty
word as the constant term, i.e., $\partial_\varepsilon E := c(E)$, then the
previous definition would simplify, for some operators, to:
\[
\partial_\ell (E + F) := \partial_\ell E \oplus \partial_\ell F \qquad
\partial_\ell (\langle k \rangle E) := \langle k \rangle (\partial_\ell E) \qquad
\partial_{\ell|\ell'} (E \mid F) := \partial_\ell (E) \mid \partial_{\ell'} (F)
\]
where for any weights $k, k'$, $k \mid k' := k \cdot k'$.

Note that these derivatives are no longer equivalent to the left quotient
of the corresponding language. Consider
$F = (a^* \mid 1)(a^+ \mid x + b^+ \mid y)^*$: the language it denotes
includes $ab|y$, yet $\partial_{a|y} F = \langle 0_K \rangle$. Albeit
surprising, this result is nevertheless sufficient as can be observed in
the derived-term automaton in Example 6: while the state
$(a^* \mid 1)(a^+ \mid x + b^+ \mid y)^*$ does accept words starting with
$a$ on the first tape, and $y$ on the second, an outgoing transition on
$a|y$ would result in a more complex automaton.


7 Related Work 


This paper is about an algorithm to convert an expression into an
automaton, and more specifically about multitape expressions.


7.1 From Expression to Automaton 


Automata and rational (or regular) expressions share the same expressive 
power [14]. This fact made rational expressions an extremely handy practical 
tool to specify some rational languages in a concise way, from which acceptors 
(automata) are built [25]. 

There are numerous algorithms to build an automaton from an expres- 
sion starting with Glushkov [12], McNaughton and Yamada [19]. Brzozowski 
[4] introduced the idea of derivatives of expressions as a means to construct 
an equivalent automaton. The method applies to extended (unweighted) 
rational expressions, and constructs a deterministic automaton. Antimirov 
[3] modified the computation to rely on parts of the derivatives (‘partial 
derivatives’), which results in nondeterministic automata. 

Lombardy and Sakarovitch [16] extended this approach to support weighted
expressions; independently, with different foundations, Rutten [23]
proposed a similar construction. Caron et al. [5] introduced support for
(unweighted) extended expressions. Demaille [8] provides support for
weighted extended expressions; expansions, originally mentioned by
Brzozowski [4], are placed at the center of the construct, replacing
derivatives, to gain independence with respect to the size of the alphabet,
and efficiency.

We are particularly interested in the derivative-based family of
algorithms, because they offer a very natural interpretation to states
(they are labeled by an expression that denotes the future of the states,
i.e., the language/series accepted from this state), and provide easy
support for on-the-fly conversion.


7.2  Multitape Expressions 


Multitape automata, including transducers, share many properties with 
‘single-tape’ automata, in particular the Fundamental Theorem [24, The- 
orem 2.1, p. 409]: under appropriate conditions, multitape automata and 
rational (multitape) series share the same expressive power. 


Multitape rational expressions have been considered early [18], but “an
n-way regular expression is simply a regular expression whose terms are
n-tuples of alphabetic symbols or ε” [13], e.g., (ε|a + ε|b)*, but not
(ε|(a + b))*. Kaplan and Kay [13] do consider the full generality of the
semantics of operations on rational languages and rational relations,
including ×, the Cartesian product of languages, and even use rational
expressions more general than their definition. They do not, however,
provide an explicit automaton construction algorithm, apparently relying on
the simple inductive construction (using the Cartesian product between
automata). Our | operator on series was defined as the tensor product,
denoted ⊗, by Sakarovitch [24, Sec. III.3.2.5], but without equivalent for
expressions.


Makarevskii and Stotskaya [18] define multitape derivatives, but (i)
in the case of expressions over tuples of letters, and (ii) only when in
so-called ‘standard form’, for which they note “no method of constructing
[an] n-expression in standard form for a regular n-expression is known.”


We first introduced multitape expressions and their derived-term au- 
tomata in Demaille [9]. This paper extends this work: the base theory is 
generalized to support spontaneous transitions, which we used in Section 5 
to introduce support for composition. 


Constructions of the derived-term automaton with completely different
grounds have been discovered [1, 7]: they do not rely on derivatives at all.
It is an open question whether these approaches can be adapted to support
a tuple or a composition operator.
8 Conclusion


Our work is in the continuation of derivative-based computations of an
automaton from an expression [3–5, 16]. However, we replaced the derivatives
by expansions, which lifted the requirement for the monoid of labels to be
free.

This freedom allowed us to generalize the computation of the derived-term
automaton to expressions with a tupling operator (a|b) and a composition
operator (a|x @ x|b). This procedure generates small automata.

Compared to the derivative-based approach, expansions allowed simpler
proofs, and a more efficient implementation.

Vcsn implements the techniques exposed in this paper. Our future
work aims at other operators, and at studying more closely the complexity
of the algorithm.